One of the biggest issues in our code right now is that, for each user on the dashboard, the code loads and generates multiple duplicate parses on every page reload. Each time the dashboard is reloaded or the file is edited, the parse is re-created in toto. As a consequence, a large amount of unnecessary parsing and re-parsing takes place. This can be solved by developing a single indexed document wrapper that provides the only in-memory version of a document. Every call that retrieves or parses the document would return this same wrapper, and the wrapper would be updated only when the document changes or new features are requested.
Development can proceed in stages: first the thin wrapper, then a single shared instance per document, then clean updates on re-request.
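A minimal sketch of the single-wrapper idea, to make the intended behavior concrete. All names here (`DocumentWrapper`, `DocumentRegistry`, `compute_feature`) are hypothetical and not taken from the existing code; the sketch assumes documents are identified by a string id and that a change can be detected by hashing the text. It is not thread-safe and leaves out the actual parser.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable

# Hypothetical signature: (text, feature_name) -> computed feature value.
FeatureFn = Callable[[str, str], Any]


@dataclass
class DocumentWrapper:
    """The only in-memory copy of one document and its computed features."""
    doc_id: str
    text: str
    text_hash: str
    features: Dict[str, Any] = field(default_factory=dict)


class DocumentRegistry:
    """Hands out one shared wrapper per document id."""

    def __init__(self, compute_feature: FeatureFn) -> None:
        self._compute_feature = compute_feature
        self._docs: Dict[str, DocumentWrapper] = {}

    def get(self, doc_id: str, text: str,
            features: Iterable[str] = ()) -> DocumentWrapper:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        wrapper = self._docs.get(doc_id)
        if wrapper is None:
            # First request for this document: create the single wrapper.
            wrapper = DocumentWrapper(doc_id, text, digest)
            self._docs[doc_id] = wrapper
        elif wrapper.text_hash != digest:
            # Text changed: keep the same wrapper object, drop stale features.
            wrapper.text = text
            wrapper.text_hash = digest
            wrapper.features.clear()
        # Compute only the features that are requested and not yet present.
        for name in features:
            if name not in wrapper.features:
                wrapper.features[name] = self._compute_feature(text, name)
        return wrapper
```

With this shape, a page reload that asks for the same document and features does no parsing at all, and asking for one extra feature computes only that feature. After an edit, features are recomputed lazily as they are next requested.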
Once the wrapper exists, we can set it up to update and also to age off when it has not been requested for a fixed period of time or when we exceed a storage window.
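The age-off behavior could look roughly like the following sketch. It assumes the "storage window" is a maximum number of cached documents (it could just as well be a byte budget) and that "requested" means any `get` or `put`. `AgingCache` and its parameters are hypothetical names for illustration only.

```python
import time
from collections import OrderedDict
from typing import Callable, Generic, Hashable, Optional, Tuple, TypeVar

V = TypeVar("V")


class AgingCache(Generic[V]):
    """Drops entries that are idle for too long or that overflow the cache."""

    def __init__(self, ttl_seconds: float, max_entries: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        # key -> (last_access_time, value), least recently used first.
        self._items: "OrderedDict[Hashable, Tuple[float, V]]" = OrderedDict()

    def put(self, key: Hashable, value: V) -> None:
        self._items[key] = (self._clock(), value)
        self._items.move_to_end(key)
        self._evict()

    def get(self, key: Hashable) -> Optional[V]:
        self._evict()
        entry = self._items.get(key)
        if entry is None:
            return None
        # A request refreshes the entry's idle timer.
        self._items[key] = (self._clock(), entry[1])
        self._items.move_to_end(key)
        return entry[1]

    def _evict(self) -> None:
        now = self._clock()
        # Entries are ordered by last access, so expired ones sit at the front.
        while self._items:
            stamp, _ = next(iter(self._items.values()))
            if now - stamp < self._ttl:
                break
            self._items.popitem(last=False)
        # Enforce the size limit by dropping the least recently used entries.
        while len(self._items) > self._max:
            self._items.popitem(last=False)

    def __len__(self) -> int:
        return len(self._items)
```

Eviction here runs lazily on each access rather than on a background timer, which keeps the sketch simple; a long-idle cache would hold its entries until the next request.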
This work should also examine what is done with the holmes extractor code. At present we use that code solely for document management. Once that is replaced inside LO, we can freely junk it.