ocdevel / gnothi

Gnothi is an open-source AI journal and toolkit for self-discovery. If you're interested in getting involved, we'd love to hear from you.
https://gnothiai.com
GNU Affero General Public License v3.0

Books: prevent and remove duplicates being saved to DB #115

Open lefnire opened 4 years ago

lefnire commented 4 years ago

Currently, the books recommender scans the Libgen database dump, finds matches, and saves the top k matches to the database. If a user thumbs a book, it is saved permanently; otherwise, the next recommender run wipes the previous matches and saves the new ones.
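To make the current behavior concrete, here is a minimal in-memory sketch of one recommender run. It is not the actual Gnothi code: `SavedBook`, `refresh_recommendations`, and the `thumbed` flag are hypothetical names, and a plain list stands in for the database table.

```python
from dataclasses import dataclass


@dataclass
class SavedBook:
    book_id: int           # Libgen-specific row id (hypothetical field name)
    title: str
    thumbed: bool = False  # True once the user has thumbed the book


def refresh_recommendations(saved, ranked_matches, k):
    """Return the saved-book list after one recommender run (illustrative only)."""
    # Books the user has thumbed are permanent and survive the wipe.
    kept = [b for b in saved if b.thumbed]
    kept_ids = {b.book_id for b in kept}
    # All other previous matches are dropped and replaced by the top k new
    # matches, skipping any that are already saved permanently.
    fresh = [b for b in ranked_matches[:k] if b.book_id not in kept_ids]
    return kept + fresh
```

Because rows are keyed on `book_id`, two Libgen rows for the same book both pass through this step as distinct entries.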

Libgen can contain many duplicate entries for a single book, because there are different formats, editions, etc. I'm using book_id as the primary key, which is Libgen-specific, not book-specific. Instead, we should use ISBN or another identifier as the primary key to prevent duplicates from being saved on recommendation.
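One possible shape for this, as a rough sketch only: derive a book-level key from the ISBN and drop later rows that share a key. The row fields (`isbn`, `title`, `author`) and the helpers `normalize_isbn`, `book_key`, and `dedupe` are assumptions for illustration, not the real Libgen schema or Gnothi code. The title/author fallback is also an assumption, for rows that have no usable ISBN.

```python
import re


def normalize_isbn(raw):
    """Strip spaces and hyphens; return None unless it looks like ISBN-10/13."""
    if not raw:
        return None
    cleaned = re.sub(r"[\s-]", "", raw).upper()
    return cleaned if re.fullmatch(r"\d{9}[\dX]|\d{13}", cleaned) else None


def book_key(row):
    """Build a book-specific key instead of relying on Libgen's book_id."""
    isbn = normalize_isbn(row.get("isbn"))
    if isbn:
        return ("isbn", isbn)
    # Assumed fallback when there is no usable ISBN: normalized title + author.
    return ("title_author", row["title"].strip().lower(), row["author"].strip().lower())


def dedupe(rows):
    """Keep the first row per key; rows are assumed to be ranked best first."""
    seen = set()
    unique = []
    for row in rows:
        key = book_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Running `dedupe` over the ranked matches before saving would prevent new duplicates; the same key could be used in a one-off cleanup of rows already in the database. Since separate editions and formats usually carry their own ISBNs, an ISBN key alone may still leave some duplicates, which is where "another identifier" would come in.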