UAlbertaALTLab / morphodict

Plains Cree Intelligent Dictionary
https://itwewina.altlab.app/
Apache License 2.0

change: uncomment line of code to include preverbs in espt #1048

Closed nienna73 closed 2 years ago

nienna73 commented 2 years ago

I uncommented a line of code to include preverbs in the translation process again.

@aarppe, from what I can tell, this makes the search functionality a lot slower, not the import process as I had thought.

The query `you will dance espt:1 auto:1 verbose:1` takes about 13 seconds to return results with this change, whereas it takes only about 1-2 seconds on production right now. The results, as far as I can tell, are exactly the same.
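To make the before/after comparison repeatable, the same query can be timed several times on each branch. The following is only a minimal sketch: `fake_search` is a hypothetical stand-in for the real search entry point, which is not shown in this thread, and `time_calls` is an illustrative helper, not part of morphodict.

```python
import statistics
import time
from typing import Callable, List


def time_calls(fn: Callable[[str], object], query: str, repeats: int = 5) -> List[float]:
    """Return wall-clock durations (seconds) for `repeats` calls of fn(query)."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(query)
        durations.append(time.perf_counter() - start)
    return durations


def fake_search(query: str) -> list:
    # Hypothetical placeholder: swap in the real search call when running locally.
    return query.split()


if __name__ == "__main__":
    runs = time_calls(fake_search, "you will dance espt:1 auto:1 verbose:1")
    print(f"median: {statistics.median(runs):.4f}s over {len(runs)} runs")
```

Reporting the median over a few runs helps separate a consistent slowdown from a one-off cold-cache effect.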

Feel free to pull this branch and look at it locally, but I'm not sure if this change is worth it at this time.

aarppe commented 2 years ago

The English FST we use to analyze the phrase is the same whether or not we make use of the preverbed cases, and the English FST we use to generate the translations is likewise the same whether or not we exclude the preverbed cases, and those translations should be pregenerated. So I have a hard time understanding why the implementation would slow down so much. With the verbose option, the analysis of the phrase and its conversion to Cree features happens quickly, and accessing pre-generated English phrases should not take that long. We will eventually need to explore in more depth why this takes so long.
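One way to explore where the extra time goes is to profile a single slow query and look at the functions with the highest cumulative time. This is a sketch using Python's standard `cProfile` and `pstats` modules; `example_search` is a hypothetical stand-in for the real search call, and `profile_call` is an illustrative helper rather than anything in the codebase.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10):
    """Run fn(*args) under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = fn(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def example_search(query):
    # Hypothetical placeholder: replace with the real search function when running locally.
    return sorted(query.split())


if __name__ == "__main__":
    _, report = profile_call(example_search, "you will dance espt:1 auto:1 verbose:1")
    print(report)
```

If the pregenerated phrases are being looked up as expected, the report would be expected to show little time in FST generation; a large share there would suggest that phrases are being generated at query time.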

Nevertheless, having `espt` and `auto` each work separately, but not at the same time, isn't necessarily the greatest loss. Not being able to get preverbed Cree wordforms as translations, however, is undesirable.

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 78.56%. Comparing base (345a10d) to head (14a851a). Report is 608 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1048      +/-   ##
==========================================
- Coverage   78.67%   78.56%   -0.12%
==========================================
  Files         151      151
  Lines        5341     5341
  Branches      707      707
==========================================
- Hits         4202     4196       -6
- Misses       1009     1014       +5
- Partials      130      131       +1
```
