Closed: johnml1135 closed this issue 1 year ago
When a source segment is too long, decoding can take a long time. We put in a hard limit of 200 words in order to be able to generate suggestions for a segment. If we increased the CPU on the server, then we might be able to increase the limit a bit.
The limit of 200 tokens is actually an aspect of NLLB 200 itself. I don't think we can extend the token limit meaningfully; instead, we should split up the long segments.
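To make the "split up the long segments" idea concrete, here is a minimal sketch, not how Serval or Machine actually does it. It assumes whitespace-separated words as a stand-in for model tokens (a real tokenizer would count differently), and the function name `split_long_segment` and the constant `MAX_TOKENS` are hypothetical. It greedily packs clauses into chunks that stay under the limit and falls back to a hard cut when a single clause is too long.

```python
import re

# Limit mentioned in this thread. Whitespace words are only an approximation
# of model tokens; this is an assumption made for illustration.
MAX_TOKENS = 200


def split_long_segment(text: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Breaks are preferred after clause-ending punctuation; a clause that is
    longer than the limit on its own is cut at the limit.
    """
    # Split on whitespace that follows punctuation, keeping the punctuation
    # attached to the preceding clause.
    clauses = re.split(r"(?<=[.;:!?,])\s+", text.strip())
    chunks: list[list[str]] = []
    current: list[str] = []
    for clause in clauses:
        words = clause.split()
        # Hard-cut any clause that cannot fit in a single chunk.
        while len(words) > max_tokens:
            if current:
                chunks.append(current)
                current = []
            chunks.append(words[:max_tokens])
            words = words[max_tokens:]
        # Start a new chunk if this clause would overflow the current one.
        if len(current) + len(words) > max_tokens:
            chunks.append(current)
            current = []
        current.extend(words)
    if current:
        chunks.append(current)
    return [" ".join(chunk) for chunk in chunks]
```

Each chunk could then be translated separately and the results joined. Whether splitting at punctuation hurts translation quality would need testing.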
Actually, in this case, I was referring to the SMT model, but yes, you are correct that NLLB also has a limit of 200 tokens.
Sorry, I was misunderstanding the context.
If the CPU is the main thing limiting the word graph, I would be more inclined to pay for more CPU. A few hundred dollars a year is worth the increased performance. What do you think?
We can also use Grafana to measure the word graph performance in production.
We would need to do testing to see how much we could increase the limit. At this point, SMT suggestions are not being used very much, so it isn't a priority to increase this limit.
This issue is related to https://github.com/sillsdev/machine.js/issues/19. After this bug is fixed, it may not be a big issue anymore.
Something is limiting this: for very long verses, a message pops up saying "this verse is too long for suggestions." Can we resolve this?