Closed kskyten closed 6 years ago
Sorry for the delay. I think it is a very good idea, and I'll probably give it a try over Christmas.
Hi everyone again,
some news about what I've been working on to address this issue.
All of this is in the database branch. There is a new papis module called database. The previous way of caching the documents was a purely ad-hoc solution, since papis was not intended to be used with a database, so some work had to be done to actually adopt a database model. It might still be quite crude, which is why I'd like to discuss it with you guys.
From now on, any interaction between papis and documents should go through a database object that manages the documents.
There are currently two database backends in place, both exposing a common API to the rest of the papis code. Logically, the common API is:
    import papis.config

    class Database:
        def __init__(self, library=papis.config.get_lib()): ...
        def get_lib(self): ...
        def get_dir(self): ...
        def match(self, document, query_string): ...
        def clear(self): ...
        def add(self, document): ...
        def update(self, document): ...
        def delete(self, document): ...
        def query(self, query_string): ...
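To make the shape of this API a bit more concrete, here is a minimal in-memory sketch of a backend that follows it. This is not the papis code: the class name InMemoryDatabase, the use of plain dicts as documents, the "ref" key used to identify a document, and the naive word matching are all assumptions made only for illustration.

```python
class InMemoryDatabase:
    """Hypothetical backend following the common API; not actual papis code."""

    def __init__(self, library="papers", directory="/tmp/papers"):
        self.library = library      # library name (assumed to be a string)
        self.directory = directory  # library directory (assumed path)
        self.documents = []         # documents are plain dicts in this sketch

    def get_lib(self):
        return self.library

    def get_dir(self):
        return self.directory

    def match(self, document, query_string):
        # Naive matching: every word of the query must occur in some field value.
        haystack = " ".join(str(v) for v in document.values()).lower()
        return all(word in haystack for word in query_string.lower().split())

    def clear(self):
        self.documents = []

    def add(self, document):
        self.documents.append(document)

    def update(self, document):
        # Replace the stored document with the same (hypothetical) "ref" key.
        for i, existing in enumerate(self.documents):
            if existing.get("ref") == document.get("ref"):
                self.documents[i] = document
                return
        self.add(document)

    def delete(self, document):
        self.documents = [
            d for d in self.documents if d.get("ref") != document.get("ref")
        ]

    def query(self, query_string):
        return [d for d in self.documents if self.match(d, query_string)]


db = InMemoryDatabase()
db.add({"ref": "einstein1905", "author": "Einstein",
        "title": "Zur Elektrodynamik bewegter Körper", "year": 1905})
db.add({"ref": "heisenberg1927", "author": "Heisenberg",
        "title": "Über den anschaulichen Inhalt der quantentheoretischen "
                 "Kinematik und Mechanik", "year": 1927})
print([d["ref"] for d in db.query("einstein")])
```

The point is only that the rest of the code talks to query, add, update and delete, and does not need to know how the backend stores or matches documents.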
I have gathered all the previous caching functions and put them inside the papis.database.cache.Database backend.
There is also an implementation using whoosh (which is much, much faster) in papis.database.whoosh.Database, which can be selected (also on a per-library basis, of course) through the option database-backend = whoosh.
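As a rough illustration of what selecting the backend per library could look like, here is a small sketch using the standard configparser module. The section names, the dir values, the helper backend_for and the fallback name "cache" are assumptions for this example; only the option name database-backend and the value whoosh come from the description above.

```python
import configparser

# Hypothetical configuration: one library opts in to whoosh, the other does not.
CONFIG_TEXT = """
[papers]
dir = ~/Documents/papers
database-backend = whoosh

[books]
dir = ~/Documents/books
"""


def backend_for(config, library, default="cache"):
    """Return the backend name for a library, falling back to a default.

    The default name "cache" is an assumption of this sketch.
    """
    if config.has_option(library, "database-backend"):
        return config.get(library, "database-backend")
    return default


config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

for library in ("papers", "books"):
    print(library, "->", backend_for(config, library))
```

With this layout, a library that says nothing keeps the old behaviour, so trying whoosh on a single library is a one-line change.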
Only rudimentary usage of whoosh is there for now, which is why I'm asking whether anyone is interested in taking a look, trying it out, and perhaps learning more about whoosh to improve it. Maybe it would be a good idea to include whoosh in the next version.
Whoosh has a much more powerful query language, which supports ANDs and ORs, e.g.
papis open author:einstein OR author:heisenberg
and so on. Whoosh really has a lot of features. This raises a question: should we use this query language everywhere a query language is used in papis?
Should we kill the papis cache library? Or leave it in place for trivial and small libraries, or for when whoosh is not available (although whoosh is pure Python, which is very nice, and installing it is a breeze)? Maybe using whoosh exclusively and removing everything else is the way to go, which would also simplify the code.
Right now there are two main stages in querying. First there is the database query, which in the case of whoosh looks like
papis open 'author:einstein OR year:1923 AND title:physik'
and then, when the database returns the documents through the query method, the papis picker lets the user pick among them. This means there is a second input, the one for the picker: in the case of rofi, the picking is done by rofi; in the case of papis.pick (curses), it is done through the config option match-format and fuzzy matching; in the case of dmenu, through dmenu's fuzzy system, and so on.
So we have two things, querying and picking. Is this confusing? Will users be able to tell the two apart? This is very relevant for the web application, for instance, @PatWie.
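To illustrate the difference between the two stages, here is a small self-contained sketch. It is not how papis or the pickers are implemented: the function names, the subsequence-based fuzzy matching and the format string "{doc[author]} {doc[title]}" (standing in for a match-format value) are all assumptions for the example.

```python
def query(documents, query_string):
    """Stage 1 (database): keep documents that contain every query word."""
    words = query_string.lower().split()
    result = []
    for doc in documents:
        haystack = " ".join(str(v) for v in doc.values()).lower()
        if all(word in haystack for word in words):
            result.append(doc)
    return result


def fuzzy_match(typed, text):
    """True if the characters of `typed` appear in `text` in the same order."""
    remaining = iter(text.lower())
    return all(ch in remaining for ch in typed.lower())


def pick_candidates(documents, typed, match_format="{doc[author]} {doc[title]}"):
    """Stage 2 (picker): fuzzy-filter the rendered lines of the query results."""
    return [
        doc for doc in documents
        if fuzzy_match(typed, match_format.format(doc=doc))
    ]


library = [
    {"author": "Einstein", "year": 1905,
     "title": "Zur Elektrodynamik bewegter Körper"},
    {"author": "Einstein", "year": 1916,
     "title": "Die Grundlage der allgemeinen Relativitätstheorie"},
    {"author": "Heisenberg", "year": 1927,
     "title": "Über den anschaulichen Inhalt der quantentheoretischen "
              "Kinematik und Mechanik"},
]

candidates = query(library, "einstein")       # what the database returns
picked = pick_candidates(candidates, "grund")  # what the user narrows it to
print(len(candidates), [doc["year"] for doc in picked])
```

The first input decides which documents leave the database at all; the second only narrows down what is already on screen, and its matching rules depend on the picker in use.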
This is a lot of information to digest, so I'll let you guys digest it for a while and if you want we can discuss a roadmap.
Thank you all!
This is now solved in version v0.6.
Using Whoosh would enable a more powerful query language and probably make the queries more performant. How is the querying done currently and what would need to be done to add whoosh?