unfoldingWord / proskomma-react-hooks

A collection of React Hooks that provide a way to simplify the implementation of Proskomma into your React projects.
https://proskomma-react-hooks.netlify.app/
MIT License
1 stars 2 forks source link

USFM strings permanently in scope in useProskomma? #2

Closed mvahowe closed 2 years ago

mvahowe commented 3 years ago

Discussion on #1 helped me to notice how useProskomma is managing document state. As @klappy says: "Update the documents array and the import will run".

I think this means that all imported documents are in scope, as USFM strings, for the lifetime of the hook? This raises several concerns for me:

  1. This mechanism at least doubles the RAM footprint, since each document is stored both as a string and within Proskomma. (In the case of syntax trees that's potentially a very big deal.)
  2. In some cases (including the current version of Chaliki) there are no USFM strings, because Chaliki uses the serialized format. In other cases the document content would be USX or TSV or something else.
  3. If/when we want to use verse mapping, we need to import VRS files which attach to already-imported documents.

The underlying issue, I think, is that the current hook uses Proskomma as if it's a function that performs transforms on an array of import documents. But Proskomma - like most models - is very, very stateful.

If we replaced Proskomma with PostgreSQL, we wouldn't consider storing all the DBMS content in an array in the view. We'd use a combination of select, insert, update and drop queries to read and write state. I think we should be aiming for something similar for Proskomma.

mvahowe commented 3 years ago

I think we could aim for an interface something like the current one, which would work somewhat differently under the hood. Instead of an array of USFM strings, you'd have an array of document ids, or maybe of docSet/bookCode tuples, eg

[ ["en_ust", "GEN"], ["en_ust", "EXO"] ]

(or you could nest all the document codes inside each element, or whatever...)

To add a document you perform a mutation (via a hook), which triggers a query which rebuilds that array.

To remove a document you setState() that array, at which point a hook queries Proskomma, compares it with the new Array and removes documents and/or docSets as necessary.

This doesn't support multi-agent changes, but since that's currently not needed in most places that's probably ok. But I think it does address the memory bloat concern, and does look like cleaner MVP separation to me.

mvahowe commented 3 years ago

I think this is how, in one of my early attempts to do something like this, I ended up with a to be imported queue that was consumed by Proskomma. So you still provide an array of content (some of which might be in the form of URLs), but you only keep that array for as long as it takes to do the import.

klappy commented 3 years ago

Thanks Mark!

The current way import is run is only temporary as it is not optimized at all but only bare minimum to get the job done.

Storing and comparing USFM strings is heavy handed, it does takes longer to parse and uses way more memory, I’m thinking at least 3x. Proskomma, previous state’s copy and next state’s copy since there is a comparison.

As for the usage of Proskomma and comparing how it’s used, that’s good we are finding the tension to work out the design principles now. This is built based on how I would expect to use it as a dev new to PK. We may have to iterate a few more times to expose all features and functions of PK via hooks.

If you can contrive some tests for each use case above on new issues we can work on architecting the hooks design pattern to accommodate for each.

mvahowe commented 3 years ago

I think that the tension we are finding is potentially useful!

To boil it right down, I think that import has to be treated as a stateful operation, and that how we get that state and, eventually, modify that state needs to be separate.

Also, I think that using serialized USFM as a key is going to be problemmatic because Proskomma makes no guarantee to ever roundtrip non-semantic formatting in USFM.

klappy commented 2 years ago

This should now be addressed in v1.x via separating importing of documents from useProskomma into useImport. There is no longer any documents available in the state of the React hooks.