zotero / translate

Browser standalone zotero translator

Refactor. Translate 2.0 #30

Open adomasven opened 1 year ago

adomasven commented 1 year ago

After a brief attempt at uncluttering the messy code in this codebase while trying to fix https://github.com/zotero/zotero/issues/3182, I have decided that it is pointless.

The translate flow itself is quite straightforward, but it is massively complicated by everything being highly stateful and by execution flow being controlled by tens of flags set both externally and internally.

There shouldn't be a need to instantiate any object when translating. Instead, you should choose your Translate (Web, Export, Import, Search), then call translate(), detect() (for Web only?) or getTranslators(). 99% of use cases would simply call translate(). If you pass no translators into translate(), it runs detect() before translation. If you pass no translators into detect(), it runs getTranslators() before detection.

All set*() methods in the current translation architecture should become parameters that are passed into the execution calls above.
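
A minimal sketch of the shape described above, not a proposal for the final API. Every name here (`WebTranslate`, the `translators` option, the in-memory `TRANSLATORS` list and its entries) is hypothetical and exists only to show the fallback chain translate() → detect() → getTranslators(), with what is currently configured through calls such as setDocument() and setTranslator() arriving as plain arguments instead.

```javascript
// Hypothetical sketch: none of these names exist in zotero/translate today.
// A tiny in-memory list stands in for real translator loading.
const TRANSLATORS = [
  {
    label: 'Example Journal',
    target: /^https:\/\/journal\.example\//,
    detect: (doc) => (doc.title ? 'journalArticle' : null),
    translate: (doc) => [{ itemType: 'journalArticle', title: doc.title }],
  },
  {
    label: 'Example Blog',
    target: /^https:\/\/blog\.example\//,
    detect: () => 'blogPost',
    translate: (doc) => [{ itemType: 'blogPost', title: doc.title }],
  },
];

const WebTranslate = {
  // Returns every translator whose target pattern matches the page URL.
  async getTranslators(doc) {
    return TRANSLATORS.filter((t) => t.target.test(doc.url));
  },

  // Runs detection; falls back to getTranslators() when no candidates are given.
  async detect(doc, { translators } = {}) {
    const candidates = translators ?? (await this.getTranslators(doc));
    return candidates.filter((t) => t.detect(doc) !== null);
  },

  // Runs translation; falls back to detect() when no translators are given.
  // `doc` and `translators` are arguments here rather than state set in advance.
  async translate(doc, { translators } = {}) {
    const usable = translators ?? (await this.detect(doc));
    if (usable.length === 0) {
      throw new Error(`No translator found for ${doc.url}`);
    }
    return usable[0].translate(doc);
  },
};
```

Under these assumptions the common case is a single call, `await WebTranslate.translate(doc)`, and nothing is instantiated or left behind between calls.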

Events/handlers should likely disappear from translate altogether. Translate should be strictly concerned with returning metadata from a given translate object. Any further processing, like item saving or progress display, should be performed by the code that calls translate. This is simple to do in practice, since most of the saving functionality already lives in other classes in the current architecture, specifically Translate.ItemSaver, which all environments that use Translate override.
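
A rough illustration of that split, assuming a stateless `translate()` that only returns items. `translate`, `MemoryItemSaver`, `saveItems` as written here, and `translateAndSave` are hypothetical stand-ins, not existing zotero/translate API; the point is only that saving and progress reporting live in the caller.

```javascript
// Hypothetical sketch. `translate` stands in for the proposed stateless call
// and `MemoryItemSaver` for an environment-specific ItemSaver.
async function translate(doc) {
  // Depends only on its input: no handlers, no saving, no progress UI.
  return doc.entries.map((entry) => ({ itemType: 'webpage', title: entry }));
}

class MemoryItemSaver {
  constructor() {
    this.saved = [];
  }

  // Stores each item and reports progress through a caller-supplied callback.
  async saveItems(items, onProgress = () => {}) {
    for (const item of items) {
      this.saved.push(item);
      onProgress(this.saved.length, items.length);
    }
    return this.saved;
  }
}

// The calling code owns everything that happens after translation.
async function translateAndSave(doc, saver, onProgress) {
  const items = await translate(doc);
  return saver.saveItems(items, onProgress);
}
```

Each environment would pass in its own saver and progress callback, so translate itself never needs to know that either exists.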

I propose rewriting the translate code from scratch and providing it as a separate file, then gradually switching all existing codebases that depend on Translate over to the new architecture, and finally phasing out Translate 1.0 after some years, once others have had time to switch over.

This would probably be a huge timesaver going forward, since every bug we encounter in translation and every change we need to make to the logic usually means a huge amount of time spent analyzing this complex piece of code, plus further debugging when a change produces unexpected behavior.

CC @dstillman @AbeJellinek @zoe-translates

AbeJellinek commented 1 year ago

Things we've discussed:

Other potential considerations:

adomasven commented 1 year ago

Case in point https://github.com/zotero/translate/commit/d7299a7969a327e71f0f8b509489167777c382a6

I've spent a lot of time troubleshooting this and trying to come up with solutions, and in the end the fix turned out to be just removing the error handling there completely. translate.complete(false) is an absolute disaster in the current codebase. E.g.:

https://github.com/zotero/translate/blob/d7299a7969a327e71f0f8b509489167777c382a6/src/translation/translate.js#L1683-L1697

Before the above change, this._loadTranslator() may have called this.complete(false), effectively meaning that the current translate operation should be terminated, but since it wasn't throwing, L1692 continued happily executing. To make matters worse, for Import translators, _loadTranslator() is also overridden:

https://github.com/zotero/translate/blob/d7299a7969a327e71f0f8b509489167777c382a6/src/translation/translate.js#L2352-L2357

So Base.prototype._loadTranslator() calls this.complete(false), but then the overriding codepath continues executing, and then the codepath in _detect() also continues executing.
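
The failure mode can be reproduced in a few lines. This is a simplified model, not the actual translate.js code: the class names and log messages are made up, and only the method names `complete()`, `_loadTranslator()` and `_detect()` mirror the ones discussed above. It contrasts signalling termination through a flag with signalling it by throwing.

```javascript
// Simplified model of the problem; not the real translate.js implementation.
class FlagBased {
  constructor() {
    this.log = [];
    this.done = false;
  }

  complete(ok) {
    // Marks the operation as finished but does not stop the caller.
    this.done = true;
    this.log.push(`complete(${ok})`);
  }

  _loadTranslator() {
    this.complete(false); // loading failed
  }

  _detect() {
    this._loadTranslator();
    // Nothing was thrown, so this still runs after "termination".
    this.log.push('detect continued');
  }
}

// Mirrors the Import case: the override keeps running after the base call.
class FlagBasedImport extends FlagBased {
  _loadTranslator() {
    super._loadTranslator();
    this.log.push('import override continued');
  }
}

class ThrowBased {
  constructor() {
    this.log = [];
  }

  _loadTranslator() {
    // Throwing unwinds every caller unless one of them chooses to handle it.
    throw new Error('translator failed to load');
  }

  _detect() {
    this._loadTranslator();
    this.log.push('detect continued'); // never reached
  }
}
```

Calling `_detect()` on a `FlagBasedImport` records all three steps even though the operation was declared complete after the first, whereas `ThrowBased` stops at the point of failure and leaves the decision to the caller.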


A bug like this shouldn't even exist in the first place if the code were architected correctly, but even if it did, with proper execution flow, figuring out the issue and fixing it should be a matter of minutes, not hours.

AbeJellinek commented 1 year ago

Horrifying!