Open longrunningprocess opened 5 months ago
SELECT SyntacticName AS part_of_speech,
       FeatureName,
       FeatureValues
FROM Features_Source
INNER JOIN SyntacticCategories
        ON SyntacticCategory = SyntacticCategories.ID
WHERE FeatureName NOT LIKE 'Spare%'
ORDER BY SyntacticCategory
part_of_speech | FeatureName | FeatureValues |
---|---|---|
Noun | Number | Singular/S\|Dual/D\|Trial/T\|Quadrial/Q\|Paucal/p\|Plural/P\| |
Noun | Participant Tracking | First Mention/I\|Routine/D\|Integration/i\|Exiting/E\|Restaging/R\|Offstage/O\|Generic/G\|Interrogative/Q\|Frame Inferable/F\|Unmarked/U\| |
Noun | Polarity | Affirmative/A\|Negative/N\| |
Noun | Proximity | Not Applicable/n\|Near Speaker and Listener/N\|Near Speaker/S\|Near Listener/L\|Remote within Sight/R\|Remote out of Sight/r\|Temporally Near/T\|Temporally Remote/t\|Contextually Near with Focus/C\|Contextually Near/c\| |
Noun | Future Expansion | Unspecified/K\| |
Noun | Person | First/1\|Second/2\|Third/3\|First Inclusive/A\|First Exclusive/B\|First as Third/F\|Second as Third/S\|First Inclusive as Third/I\|First Exclusive as Third/E\| |
Noun | Surface Realization | Noun/N\|Always a Noun/A\|PRO/p\|Personal Pronoun/P\|Reflexive Pronoun/R\|Reciprocal Pronoun/r\|Possessive Pronoun/a\|Locative Pronoun/L\|Relative Pronoun/D\|Big Pro Plus/B\|Conjoined Personal Pronoun/C\| |
Noun | Participant Status | Not Applicable/N\|Protagonist/P\|Antagonist/A\|Major Participant/M\|Minor Participant/m\|Major Prop/p\|Minor Prop/r\|Significant Location/L\|Insignificant Location/l\|Significant Time/T\|Emphasized/E\| |
Verb | Time | Past/Y\|Future/Z\|Present/P\|Immediate Past/D\|Earlier Today/A\|Yesterday/a\|2 Days Ago/b\|3 Days Ago/c\|A Week Ago/d\|A Month Ago/e\|A Year Ago/f\|During Speaker's Lifetime/g\|Historic Past/h\|Eternity Past/i\|Unknown Past/q\|Discourse/r\|Immediate Future/E\|Later Today/F\|Tomorrow/j\|2 Days from Now/k\|3 Days from Now/l\|A Week from Now/m\|A Month from Now/n\|A Year from Now/o\|During Speaker's Lifetime (future)/s\|Unknown Future/p\|Timeless/T\| |
Verb | Aspect | Inceptive/N\|Completive/C\|Cessative/c\|Continuative/o\|Imperfective/I\|Routine/R\|Habitual/H\|Gnomic/G\|Unmarked/U\| |
Verb | Mood | Indicative/I\|Definite Potential/a\|Probable Potential/b\|'might' Potential/c\|'must' Obligation/f\|'should' Obligation/g\|'may' (permissive)/l\|'could' enablement/C\| |
Verb | Reflexivity | Not Applicable/N\|Reciprocal/R\|Reflexive/r\| |
Verb | Polarity | Affirmative/A\|Negative/N\|Emphatic Affirmative/E\|Emphatic Negative/e\| |
Verb | Adjective Degree | No Degree/N\|Comparative/C\|Superlative/S\|Intensified/I\|Extremely Intensified/E\|'too'/T\|'less'/L\|'least'/l\| |
Verb | Target Tense & Form | Unspecified/.\|Past/P\|Present/p\|Future/F\|"to"/t\|"-ing"/i\|Stem/N\|"-en"/e\| |
Adjective | Degree | No Degree/N\|Comparative/C\|Superlative/S\|Intensified/I\|Extremely Intensified/E\|'too'/T\|'less'/L\|'least'/l\|Equality/q\|Intensified Comparative/i\|Intensified 'less'/c\|Superlative of 2 items/s\| |
Adverb | Degree | No Degree/N\|Comparative/C\|Superlative/S\|Intensified/V\|Extremely Intensified/E\|'too'/T\|'less'/L\|'least'/l\| |
Conjunction | Implicit | No/.\|Yes/Y\| |
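
For reference, here is a minimal sketch of how a `FeatureValues` string could be parsed, assuming each string is a pipe-separated list of `Name/Code` pairs that may end with a trailing separator. `FeatureValue` and `parse_feature_values` are hypothetical names used for illustration only; they are not part of the existing codebase.

```python
from typing import List, NamedTuple


class FeatureValue(NamedTuple):
    name: str  # human-readable label, e.g. "Singular"
    code: str  # code used for the value, e.g. "S"


def parse_feature_values(raw: str) -> List[FeatureValue]:
    """Split a 'Name/Code|Name/Code|...' string into (name, code) pairs."""
    values = []
    for segment in raw.split("|"):
        segment = segment.strip()
        if not segment:
            continue  # skip the empty piece left by a trailing '|'
        # Split on the LAST '/' so a label that itself contains '/' stays intact.
        name, _, code = segment.rpartition("/")
        values.append(FeatureValue(name, code))
    return values


number = parse_feature_values("Singular/S|Dual/D|Trial/T|Quadrial/Q|Paucal/p|Plural/P|")
tense = parse_feature_values(
    'Unspecified/.|Past/P|Present/p|Future/F|"to"/t|"-ing"/i|Stem/N|"-en"/e|'
)
```

Two details worth keeping in mind: codes are case-sensitive (`Paucal/p` and `Plural/P` differ only by case), and `Unspecified/.` parses like any other value, so nothing special is needed to keep it.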
- Is it ok to exclude the "Spare" rows?
I'm not sure at this point. We may find that excluding them impacts the order/position of the other features, in which case we will need them.
- Should we also exclude Noun's "Future Expansion" row?
Same as above
- When parsing out the values, should "Verb Target Tense & Form"'s "Unspecified/." be excluded as well?
'Unspecified' should never be excluded; it is a valid and meaningful value.
- Is it ok to move this to the Bible source? I'm just curious if these can change per project. If so, they can't be moved; if they can't, why did they end up in the English project?
Yeah, upon further reflection, some of these need to be tied to the English project. The features themselves (e.g. Noun Proximity) are the same across all projects, but a target project can add values to each feature. The best example of this is 'Target Tense & Form', where all values except 'Unspecified' are unique to the project.
Most of these values are common, though, and should be included in the Sources db, since they are used within the semantic representation. Any project-specific values would only be used within that project's grammar rules.
In addition, a target project may make use of the 'Spare' rows by giving them a name and values. Again, these features will not appear in the semantic representation; they are only used within the grammar rules. See the following example from my Swahili project: Note all the 'Original...' columns. I think those are included so that the user can 'reset' a feature to its original state. I think we can achieve a similar effect by keeping the common features in a separate db (i.e. the Sources db) from the project-specific ones (i.e. the Targets db). We can hash that out more, though.
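
To make the common/project-specific split concrete, here is a rough sketch of how the two layers might be combined. Everything in it is an assumption for illustration: the dictionaries stand in for the Sources and Targets dbs, the names (`COMMON_FEATURES`, `ENGLISH_ADDITIONS`, `project_features`) are hypothetical, and treating Noun Polarity as fully common is a guess. Only the 'Target Tense & Form' split comes from the discussion above.

```python
from typing import Dict, List, Optional, Tuple

FeatureKey = Tuple[str, str]  # (part of speech, feature name)
FeatureMap = Dict[FeatureKey, List[str]]

# Hypothetical stand-in for the common values kept in the Sources db.
COMMON_FEATURES: FeatureMap = {
    ("Verb", "Target Tense & Form"): ["Unspecified/."],
    ("Noun", "Polarity"): ["Affirmative/A", "Negative/N"],  # assumed common
}

# Hypothetical stand-in for one project's additions kept in the Targets db.
ENGLISH_ADDITIONS: FeatureMap = {
    ("Verb", "Target Tense & Form"): [
        "Past/P", "Present/p", "Future/F", '"to"/t', '"-ing"/i', "Stem/N", '"-en"/e',
    ],
}


def project_features(common: FeatureMap, additions: Optional[FeatureMap] = None) -> FeatureMap:
    """Return the common values with a project's additions appended."""
    # Copy each list so the common layer is never modified.
    merged = {key: list(values) for key, values in common.items()}
    for key, extra in (additions or {}).items():
        target = merged.setdefault(key, [])
        for value in extra:
            if value not in target:
                target.append(value)
    return merged


english = project_features(COMMON_FEATURES, ENGLISH_ADDITIONS)
reset = project_features(COMMON_FEATURES)  # no project layer: the original state
```

Because the common layer is never modified, "resetting" a feature amounts to ignoring the project layer for it, which may remove the need for separate 'Original...' columns. A renamed 'Spare' feature could presumably be carried the same way, as a key that exists only in the project layer.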
do #8 first
Use Sample.mdb instead of English.mdb
formed from the pairing session in https://github.com/presciencelabs/tabitha-targets/issues/4#issuecomment-2125655542