Open GeorgeS2019 opened 8 months ago
Is it possible that you are not using the latest version of SimpleNLG-DE? Previous versions had an issue with the indexing of variants, which has been fixed in the latest version (#2) and could lead to this behaviour.
@DaBr01
I use SimpleNLG-DE v1.1.1 from Maven. The tests ported to C# pass without problems against the IKVM build of SimpleNLG-DE, but loading the 42 MB MucLex.xml is slow with the IKVM approach.
I am porting the code from the current master branch to C#.
I am still learning SimpleNLG-DE and have not yet fully tracked what was changed from the parent code, which is tailored for English, to create SimpleNLG-DE.
But if the tests pass, then the base form is the same. I am not sure I understand what the problem is.
The first approach, the IKVM version, uses SimpleNLG-DE.jar 1.1.1 from Maven without porting Java to C#.
Performance is the issue there.
The second approach ports the existing Java code in the master branch to C#. It promises far better performance when loading MucLex.xml than the first approach.
However, I need to track how the SimpleNLG-DE Java code creates the inflected WordElement (with the right base form) used in the tests cited above.
My port does not get this right yet: the C# version from the second approach fails to provide the right base form for the inflected words used in those tests.
If you follow the issue I linked above you will find the exact commit where this was introduced / fixed in SimpleNLG-DE (commit d77058a)
@DaBr01 thanks, this is a good starting point for learning SimpleNLG-DE.
@DaBr01 Good morning, the tests above pass now.
Question 1: Which part of the code deals with the capitalization of nouns?
    [Fact]
    public void CreateAMoreComplexSentence1()
    {
        SPhraseSpec sentence = nlgFactory.createClause();
        NPPhraseSpec subject = nlgFactory.createNounPhrase("der hund");
        VPPhraseSpec verb = nlgFactory.createVerbPhrase("jagen");
        NPPhraseSpec object1 = nlgFactory.createNounPhrase("george");
        sentence.setSubject(subject);
        sentence.setVerb(verb);
        sentence.setObject(object1);
        string output = realiser.realiseSentence(sentence);
        Assert.Equal("Der Hund jagt George.", output);
    }
I managed to get "jagt" from "jagen". However, I could not get the capitalization george => George and hund => Hund.
I would appreciate your help.
Uff, I really don't know that by heart :) A search in the repo might be helpful: https://github.com/search?q=repo%3Asebischair%2FSimpleNLG-DE+capital&type=code
Looks like there is a function capitaliseFirstLetter in the OrthographyProcessor.
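For orientation while porting, here is a minimal sketch of what a helper with that name typically does. It is an assumption for illustration only, not the code from SimpleNLG-DE's OrthographyProcessor: the class name `CapitaliseSketch` is hypothetical and the real method may have a different signature and handle more cases.

```java
import java.util.Locale;

// Hypothetical helper for illustration; NOT the code from SimpleNLG-DE's OrthographyProcessor.
final class CapitaliseSketch {

    // Returns the input with its first character upper-cased; the rest is left untouched.
    static String capitaliseFirstLetter(String text) {
        if (text == null || text.isEmpty()) {
            return text;
        }
        return text.substring(0, 1).toUpperCase(Locale.GERMAN) + text.substring(1);
    }

    public static void main(String[] args) {
        // Only the first letter of the whole string changes.
        System.out.println(capitaliseFirstLetter("der hund jagt george."));
    }
}
```

A helper like this only touches the start of the string, so on its own it would not turn `hund` into `Hund` or `george` into `George` in the middle of a sentence. That part probably depends on the words being recognised as nouns, which is worth checking in the ported lexicon lookup.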
NOTE: this refers to the C# port.
Question 1: Please suggest at which step of creating the inflected WordElement this could go wrong, such that the base form ends up wrong.
https://github.com/sebischair/SimpleNLG-DE/blob/5c831cb9722406c749bc00bdd867e4d694e4bb4a/src/test/java/simplenlgde/morphology/BasewordTest.java#L51
baseformAdj2 remains "gute" instead of "gut"
https://github.com/sebischair/SimpleNLG-DE/blob/5c831cb9722406c749bc00bdd867e4d694e4bb4a/src/test/java/simplenlgde/morphology/BasewordTest.java#L64
baseformvp2 remains "bin" instead of "sein"
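One possible cause, going by the variant-indexing issue mentioned earlier in the thread, is that the inflected forms never make it into the lookup table of the port. The sketch below is a guess at the mechanism, not SimpleNLG-DE's implementation: `VariantIndexSketch`, `addEntry` and `baseFormOf` are hypothetical names and the sample entries are illustrative only. It shows how a lookup that falls back to its input produces exactly the "gute" stays "gute" symptom when a variant is not indexed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical lexicon sketch; class and method names are NOT SimpleNLG-DE's API.
final class VariantIndexSketch {
    private final Map<String, String> baseFormByVariant = new HashMap<>();

    // Registers the base form itself and every inflected variant under the same base form.
    void addEntry(String baseForm, List<String> variants) {
        baseFormByVariant.put(baseForm, baseForm);
        for (String variant : variants) {
            baseFormByVariant.put(variant, baseForm);
        }
    }

    // Falls back to the input unchanged when the word was never indexed.
    String baseFormOf(String word) {
        return baseFormByVariant.getOrDefault(word, word);
    }

    public static void main(String[] args) {
        VariantIndexSketch lexicon = new VariantIndexSketch();
        lexicon.addEntry("gut", List.of("gute", "guter", "gutes"));
        // Variants of "sein" deliberately left out to show the fallback.
        lexicon.addEntry("sein", List.of());

        System.out.println(lexicon.baseFormOf("gute")); // indexed variant resolves to "gut"
        System.out.println(lexicon.baseFormOf("bin"));  // not indexed, so "bin" comes back unchanged
    }
}
```

If the port behaves like the second lookup, comparing the number of indexed variants between the Java and C# lexicons after loading MucLex.xml might narrow down where the entries get lost.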