mfarragher / obsidiantools

Obsidian tools - a Python package for analysing an Obsidian.md vault
Other
402 stars 28 forks

Incremental refresh #23

Open louis030195 opened 1 year ago

louis030195 commented 1 year ago

Hey, for https://github.com/louis030195/obsidian-ava, I'm trying to implement incremental refresh of the state of the vault.

Concretely, I build sentence embeddings of the whole vault and would like to re-compute embeddings every time a note is updated/deleted/created.

Do you see any way of doing this incrementally rather than reloading the vault and recomputing everything every time? (It takes ~1 min on an MPS device for my 500k-word vault.)
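To make the question concrete, here is a minimal sketch of the kind of incremental update I mean, assuming the note texts are available as a plain `dict` of note name to text. The names `refresh_embeddings`, `content_hash` and the `embed` callable are hypothetical and not part of obsidiantools; the idea is just to keep a content hash next to each embedding and only call the model for notes whose text changed.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Embedding = List[float]


def content_hash(text: str) -> str:
    """Return a stable digest of a note's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh_embeddings(
    notes: Dict[str, str],
    cache: Dict[str, Tuple[str, Embedding]],
    embed: Callable[[str], Embedding],
) -> List[str]:
    """Update `cache` in place and return the names of re-embedded notes.

    `notes` maps note name -> current text (hypothetical input shape).
    `cache` maps note name -> (content hash, embedding).
    `embed` is whatever embedding function you already use (assumption).
    """
    # Drop cache entries for notes that no longer exist in the vault.
    for name in list(cache):
        if name not in notes:
            del cache[name]

    recomputed = []
    for name, text in notes.items():
        digest = content_hash(text)
        cached = cache.get(name)
        # Re-embed only new notes or notes whose text has changed.
        if cached is None or cached[0] != digest:
            cache[name] = (digest, embed(text))
            recomputed.append(name)
    return recomputed
```

This still needs the current text of every note, so it only helps if the embedding step is the slow part rather than loading the vault.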

Ideally, I'd like to see an API in this library that lets me listen to vault changes via callback(s).

Thanks 🚀😃

mfarragher commented 1 year ago

These are a few API changes that I think can enable this:

I can see how incremental refresh would be useful for NLP analytics. For that use case, only the dictionaries of text would need to be updated. By contrast, incrementally refreshing the info populated via connect() seems complex.

I looked at libraries that monitor file changes in a directory. There are a few out there, but they don't seem to be cross-platform; they seem Linux-focused. I won't add functionality for this in obsidiantools via a new method or class, but I think it'll be interesting to see what recipes can be made for this.
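As one possible shape for such a recipe, here is a rough sketch that avoids a platform-specific watcher by polling file modification times with the standard library only. The function names (`snapshot`, `diff_snapshots`, `poll_once`) are hypothetical and not part of obsidiantools, and it assumes notes are `.md` files under a single vault directory.

```python
import os
from typing import Callable, Dict, List


def snapshot(vault_dir: str) -> Dict[str, int]:
    """Map each .md file (path relative to the vault) to its mtime in ns."""
    state = {}
    for root, _dirs, files in os.walk(vault_dir):
        for fname in files:
            if fname.endswith(".md"):
                path = os.path.join(root, fname)
                rel_path = os.path.relpath(path, vault_dir)
                state[rel_path] = os.stat(path).st_mtime_ns
    return state


def diff_snapshots(
    old: Dict[str, int], new: Dict[str, int]
) -> Dict[str, List[str]]:
    """Compare two snapshots and list created, modified and deleted notes."""
    return {
        "created": sorted(set(new) - set(old)),
        "modified": sorted(p for p in new if p in old and new[p] != old[p]),
        "deleted": sorted(set(old) - set(new)),
    }


def poll_once(
    vault_dir: str,
    previous: Dict[str, int],
    on_change: Callable[[Dict[str, List[str]]], None],
) -> Dict[str, int]:
    """Take a new snapshot, call `on_change` if anything differs, return it."""
    current = snapshot(vault_dir)
    changes = diff_snapshots(previous, current)
    if any(changes.values()):
        on_change(changes)
    return current
```

Calling `poll_once` in a loop with a short `time.sleep` between iterations, and feeding its return value back in as `previous`, gives a basic change listener. Polling trades some latency and repeated directory scans for not depending on any OS-specific notification API.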

mfarragher commented 1 year ago

So far I've added this to the code: https://github.com/mfarragher/obsidiantools/commit/5b73e6f7d50a475adcd1f12023e1ba4b42626777

I have also been prototyping some code to listen to file changes with another library. I still need to experiment with async for the attr updates.

mfarragher commented 1 year ago

More attr setters are available since this commit: https://github.com/mfarragher/obsidiantools/commit/6880842c62b297d52e1a54c9ca08f3eab0edec02