Open sarahyj opened 7 years ago
The subject of a sentence is a person, place, or thing. You can screen for those in compromise with .people().data(), .places().data(), and .nouns().data().
The subject could also be a pronoun, which is harder to screen for. Find the main verb. Verbs preceded by "to" are infinitives. I've noticed compromise tends to label verbs as infinitives when they are not. Watch for split infinitives. Infinitives are not main verbs. Skip candidate subjects that are part of the main verb's verb phrase, and look for ones that are part of a noun phrase.
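The heuristic described above can be sketched without any library. The following is only a rough illustration, assuming the words have already been tagged somehow; the `TaggedToken` shape, the tag names, and the function names are all hypothetical and are not part of compromise.

```typescript
// Hypothetical token shape: the word plus a coarse part-of-speech tag.
type Tag = "Noun" | "Pronoun" | "Verb" | "Determiner" | "Adjective" | "Adverb" | "Other";
interface TaggedToken { text: string; tag: Tag; }

// A verb counts as an infinitive when it directly follows "to", or follows
// "to" plus a single adverb (a split infinitive such as "to boldly go").
function isInfinitive(tokens: TaggedToken[], i: number): boolean {
  const prev = i >= 1 ? tokens[i - 1] : undefined;
  const prev2 = i >= 2 ? tokens[i - 2] : undefined;
  if (prev !== undefined && prev.text.toLowerCase() === "to") return true;
  return prev !== undefined && prev2 !== undefined &&
    prev.tag === "Adverb" && prev2.text.toLowerCase() === "to";
}

// The "main" verb here is simply the first verb that is not an infinitive.
function findMainVerbIndex(tokens: TaggedToken[]): number {
  for (let i = 0; i < tokens.length; i++) {
    if (tokens[i].tag === "Verb" && !isInfinitive(tokens, i)) return i;
  }
  return -1;
}

// Walk left from the main verb to the nearest noun or pronoun, then extend
// left over determiners, adjectives, and nouns to collect the noun phrase.
function findSubject(tokens: TaggedToken[]): string | null {
  const verbIndex = findMainVerbIndex(tokens);
  let end = verbIndex - 1;
  while (end >= 0 && tokens[end].tag !== "Noun" && tokens[end].tag !== "Pronoun") end--;
  if (end < 0) return null;
  let start = end;
  if (tokens[end].tag === "Noun") {
    while (start > 0) {
      const t = tokens[start - 1].tag;
      if (t !== "Determiner" && t !== "Adjective" && t !== "Noun") break;
      start--;
    }
  }
  return tokens.slice(start, end + 1).map(t => t.text).join(" ");
}

const example: TaggedToken[] = [
  { text: "The", tag: "Determiner" }, { text: "tired", tag: "Adjective" },
  { text: "dog", tag: "Noun" }, { text: "wants", tag: "Verb" },
  { text: "to", tag: "Other" }, { text: "sleep", tag: "Verb" },
];
console.log(findSubject(example)); // "The tired dog"
```

This deliberately ignores gerund and infinitive subjects, so it returns null for a sentence like "To err is human".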
@videophonegeek
I would recommend using .topics(), which also includes Organization.
Couldn't it also be a gerund, as in "Driving makes sebi tired.", or an infinitive construction with a gerund, as in "Making everybody happy is not easy"?
[Native German speaker here, not sure if it's the same in English ...]
(groan) "Making everybody happy is not easy." It is correct English grammar and the grammar police will not come after you.
Yeah @sebilasse, good point. This is really something I want the library to start doing.
he was walking really fast -> GerundVerb
walking is really fun -> GerundNoun
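One very crude way to approximate the distinction in the two examples above is to check whether the "-ing" word directly follows a form of "to be". This is only a sketch of that single rule, not how compromise does or would do it; `classifyGerund` and `BE_FORMS` are hypothetical names, and any word ending in "-ing" (including plain nouns such as "morning") will be misread as a gerund.

```typescript
type GerundRole = "GerundVerb" | "GerundNoun" | "NotGerund";

// Forms of "to be" that commonly precede a progressive verb.
const BE_FORMS: string[] = ["am", "is", "are", "was", "were", "be", "been", "being"];

// Classify the word at index i using only the word immediately before it.
function classifyGerund(words: string[], i: number): GerundRole {
  const word = words[i].toLowerCase();
  if (!/ing$/.test(word)) return "NotGerund";
  const prev = i > 0 ? words[i - 1].toLowerCase() : "";
  // "was walking" -> acts as a verb; otherwise assume it acts as a noun.
  return BE_FORMS.indexOf(prev) !== -1 ? "GerundVerb" : "GerundNoun";
}

console.log(classifyGerund("he was walking really fast".split(" "), 2)); // "GerundVerb"
console.log(classifyGerund("walking is really fun".split(" "), 0));      // "GerundNoun"
```

A real solution would need to look at more context, for example an adverb sitting between the auxiliary and the "-ing" word.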
I think there's a task for that somewhere... ah, here.
...I have no idea how it would work, and would love some help.
I'm doing something like this. I restrict my interpreter to imperative sentences only, so I look at any noun to the left of the verb (and restrict to one verb only) and treat it as the subject of the sentence on my end. I also infer "I" as the subject of most sentences without an explicit subject. My usage is fairly narrow and covers text-based games from the 1980s and 1990s.
Here's a portion of my TypeScript code for working with subject identification. Note that Command is basically a wrapper object I have representing the sentence, and CommandToken is a similar wrapper built around compromise terms.
private identifySentenceNouns(command: Command, tokens: CommandToken[]): void {
  let indexOfVerb: number = -1;
  if (command.verb) {
    indexOfVerb = tokens.indexOf(command.verb);
  }

  // Grab the nouns and stick them into the sentence as the objects
  const nouns: CommandToken[] = tokens.filter(t => SentenceParserService.isNounLike(t));
  for (const noun of nouns) {
    // When no verbs are present and the first noun is a direction, interpret it as a 'Go' verb.
    if (!command.verb && noun.classification === TokenClassification.Direction) {
      command.verb = this.buildGoToken();
    }

    // If this noun comes before the verb, we're going to use it as a subject instead of as an object, but only for the first noun
    if (!command.subject && indexOfVerb > tokens.indexOf(noun)) {
      command.subject = noun;
    } else {
      command.objects.push(noun);
    }
  }
}
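Since the method above depends on project-specific wrappers, here is a standalone sketch of the same idea that also shows the implicit "I" fallback mentioned earlier. The `SimpleToken` and `ParsedCommand` types and the `parseCommand` function are made up for illustration and are not taken from the angularIF source.

```typescript
// Hypothetical, simplified stand-ins for the project's wrapper types.
interface SimpleToken { text: string; kind: "noun" | "verb" | "other"; }
interface ParsedCommand { subject: string; verb: string | null; objects: string[]; }

function parseCommand(tokens: SimpleToken[]): ParsedCommand {
  // Only one verb is supported: take the first one found.
  let verbIndex = -1;
  for (let i = 0; i < tokens.length; i++) {
    if (tokens[i].kind === "verb") { verbIndex = i; break; }
  }

  let subject: string | null = null;
  const objects: string[] = [];
  for (let i = 0; i < tokens.length; i++) {
    if (tokens[i].kind !== "noun") continue;
    // The first noun to the left of the verb is the subject; all others are objects.
    if (subject === null && verbIndex > i) {
      subject = tokens[i].text;
    } else {
      objects.push(tokens[i].text);
    }
  }

  return {
    // Imperative sentences without an explicit subject are assumed to mean "I".
    subject: subject !== null ? subject : "I",
    verb: verbIndex >= 0 ? tokens[verbIndex].text : null,
    objects,
  };
}

const takeLamp = parseCommand([
  { text: "take", kind: "verb" },
  { text: "lamp", kind: "noun" },
]);
console.log(takeLamp.subject, takeLamp.verb, takeLamp.objects); // I take [ 'lamp' ]
```

The direction-to-'Go' rule from the original method is left out here to keep the sketch short.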
My full project is available at https://gitlab.com/IntegerMan/angularIF. The tests aren't implemented yet and there's still a ton of documentation and other work to be done, but if it's helpful, check it out.
https://gitlab.com/IntegerMan/angularIF/blob/master/src/app/engine/parser/sentence-parser.service.ts in particular may be interesting to you.
@spencermountain our conversation earlier encouraged me to open up the source for you to take a peek at if you're interested.
Hey, there is now a .subjects() method in compromise-sentences. I'm not sure how well it is working, and would love some eyes on it.
Will keep this open.
It looks like .subjects() is no longer included in compromise-sentences, yet it is still documented:
doc = nlp("Ecological rule that states that no two species can occupy the same exact niche in the same habitat at the same time.").sentences().subjects()
// Uncaught TypeError: nlp(...).sentences(...).subjects is not a function
The subject of a sentence is now calculated under the hood and exposed through .json():
doc = nlp("Ecological rule that states that no two species can occupy the same exact niche in the same habitat at the same time.").sentences().json()[0].sentence.subject
// 'ecological rule that'
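If you read the subject this way, it may be worth guarding against entries where the field is missing. The sketch below works on plain data shaped like the result shown above; the `SentenceJson` interface only models the one field used in this thread and is an assumption, not the library's published type, and `subjectsFromJson` is a hypothetical helper.

```typescript
// Assumed minimal shape of one entry from .sentences().json(); only the
// field used in this thread is modelled here.
interface SentenceJson {
  sentence?: { subject?: string };
}

// Return the subject of each sentence, or null where none was reported.
function subjectsFromJson(json: SentenceJson[]): (string | null)[] {
  return json.map(entry =>
    entry.sentence && entry.sentence.subject ? entry.sentence.subject : null
  );
}

// Fixture data standing in for real library output.
const fixture: SentenceJson[] = [
  { sentence: { subject: "ecological rule that" } },
  {},
];
console.log(subjectsFromJson(fixture)); // [ 'ecological rule that', null ]
```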
I would never have thought that this would be possible without massive overhead. Kudos to the maintainers!
Hey @MilkyDeveloper, sorry about that. I'm not sure what happened to sentences().subject().
I recommend using verbs().subjects() like this:
https://runkit.com/spencermountain/649460605ec8ad0009f65e68
The only difference is that sentences().subject() did some pretty weak analysis to find the 'main' verb in the sentence, whatever that means. Depending on your context, you may want more control over which verbs you get the subject of.
I can look at bringing back the function, but it still needs a smarter solution. I will remove it from the readme for now. Any help is welcome. Cheers.
Is it possible to detect the subject of the sentence using compromise? :)