Closed KeKs0r closed 4 months ago
Hi @KeKs0r, thanks for opening the issue. Let me think about it and come back to you.
In short, the main challenge is that MiniSearch internally uses short document IDs for various efficiency reasons. Currently, it does not keep a mapping from original ID to short ID (only the opposite way around), so it doesn't know how to add a field to an existing document identified by ID.
This can probably be implemented, making it possible to add or remove some fields to/from an existing document, but it should be done in a way that does not make the data structures much larger.
Note that updating in place, either a whole document or just a field, will not be possible. One has to first remove the old document (or field, if we implement this), then add the new one. While it can be cumbersome for the application developer to keep the old document around so it can be deleted, the alternative is worse: MiniSearch would have to copy each indexed field in each document (just referencing won't work, because the document can change in place), or at least keep the list of processed tokens for each indexed field, so it can de-index them upon removal/update. This would make MiniSearch use a lot more memory and be slower to index for everyone, even those who don't use this feature. Application developers can also implement this more efficiently on their side, even if it is a bit cumbersome, by using some "copy on write" mechanism on the documents.
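To make the remove-then-add requirement concrete, here is a minimal sketch using a toy inverted index. It is not MiniSearch's implementation: `ToyIndex`, `Doc`, and `replaceDocument` are hypothetical names made up for illustration. The point it tries to show is that an index which stores no copy of the field text can only de-index a document by re-tokenizing the old version, so the caller has to supply it.

```typescript
// Toy illustration only; all names here are hypothetical, not MiniSearch APIs.
type Doc = { id: string } & Record<string, string>;

class ToyIndex {
  // term -> set of document IDs; the original field text is NOT stored.
  private postings = new Map<string, Set<string>>();
  private fields: string[];

  constructor(fields: string[]) {
    this.fields = fields;
  }

  // Split every indexed field of a document into lowercase terms.
  private tokens(doc: Doc): string[] {
    const out: string[] = [];
    for (const field of this.fields) {
      const text = doc[field] ?? "";
      for (const t of text.toLowerCase().split(/\W+/)) {
        if (t.length > 0) out.push(t);
      }
    }
    return out;
  }

  add(doc: Doc): void {
    for (const t of this.tokens(doc)) {
      let ids = this.postings.get(t);
      if (!ids) {
        ids = new Set<string>();
        this.postings.set(t, ids);
      }
      ids.add(doc.id);
    }
  }

  // Needs the document exactly as it was when indexed: the terms to
  // de-index are recomputed from it, because the index kept no copy.
  remove(oldDoc: Doc): void {
    for (const t of this.tokens(oldDoc)) {
      const ids = this.postings.get(t);
      if (!ids) continue;
      ids.delete(oldDoc.id);
      if (ids.size === 0) this.postings.delete(t);
    }
  }

  search(term: string): string[] {
    const ids = this.postings.get(term.toLowerCase());
    return ids ? Array.from(ids).sort() : [];
  }
}

// An "update" is a removal of the old version followed by an add of the new one.
function replaceDocument(index: ToyIndex, oldDoc: Doc, newDoc: Doc): void {
  index.remove(oldDoc);
  index.add(newDoc);
}
```

In this sketch, building the new version as a fresh object (for example `{ ...oldDoc, title: "new title" }`) instead of mutating the old one is the "copy on write" idea: the old object stays intact until it has been removed from the index.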
I think keeping the "old" document around in this case is not an issue. This is already the case for "data changes" anyways. There are 2 changes in regards to Columns
Is the original ID -> short ID function a hash, or is it random? Also, if it has the mapping short ID -> original ID, it may not be as efficient, but maybe it's fine to do a full lookup of IDs only for this use case. This way, it would not affect memory consumption for all other use cases.
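The full lookup suggested above could look roughly like the sketch below. It assumes the short ID -> original ID table can be modeled as a `Map<number, string>`; that shape and the name `findShortId` are assumptions for illustration, not MiniSearch internals.

```typescript
// Hypothetical reverse lookup: scan the short ID -> original ID table
// to find the short ID for a given original ID. Runs in O(n) time per
// call and keeps no second (original ID -> short ID) map in memory.
function findShortId(
  shortToOriginal: Map<number, string>,
  originalId: string
): number | undefined {
  for (const [shortId, original] of Array.from(shortToOriginal.entries())) {
    if (original === originalId) return shortId;
  }
  return undefined; // the original ID is not in the table
}
```

The trade-off is the one described in the comment: no extra memory for users who never update documents, at the price of a linear scan for each document that is updated.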
The new version v6.0.0-beta.1 includes changes that would make this feature possible in the near future (at least adding a field). I will consider this for 6.1.0.
Hi Luca,
I mentioned this single use case in another ticket, where we discussed several things: https://github.com/lucaong/minisearch/issues/106
I wanted to create a dedicated issue for this specific case. I am powering a filter/search of a table with MiniSearch, and users are able to add and remove columns from that table. Recreating the index is quite a performance hit on that page, and I am trying to make reindexing more performant on data changes. One of these scenarios is adding a column. Currently I am creating a completely new index just to add or remove a single field.
It would be great if there were an API to add or remove a single field from the index. Even doing it for every individual document, with the current value of the field, would be fine.