Open ERJavier opened 11 months ago
Hi
Thanks for raising this issue! Apologies for the extremely late reply, I've been slowly working on a refinement to the loading process, and it's been exceptionally long in getting time, plus waiting for the database to build and then try again, etc!
You're right that the database isn't optimised. Originally it was designed to take an existing identifier (mainly ISBN), and good text searching hasn't been done.
Trying your own steps, I get:
SELECT data FROM authors WHERE data ->> 'name' ILIKE 'Graham Greene'
~ 15 seconds
On the second example you don't need to query the JSON data within the works table because the author key has been linked as an indexed identifier through the author_works table. For example, using a key that I obtained above:
select *
from works w
join author_works aw
on w.key = aw.work_key
where author_key = '/authors/OL6856462A'
~0.3 seconds
If I can get full text searching on some key fields (like author name), the abive steps should all be able to be done in one single query. I'll create a branch for that under this issue
Even after optimizing my Python code and database configuration. Queries are taking over 2 minutes and 30 seconds to execute.
I am running the following SQL queries:
SELECT data FROM authors WHERE data ->> 'name' ILIKE %s
SELECT * FROM works WHERE data-> 'authors' @> '[{{\"author\": {{\"key\": \"{author_key}\"}}}}]'
Environment Details
I have already tried the following:
I expected both queries to execute much faster, ideally within a few seconds.
Thank you 🙏