c-w / gutenberg

A simple interface to the Project Gutenberg corpus.
Apache License 2.0

Cache creation warning, how to use BSD-DB? #101

Closed: carlosalvidrez closed this issue 6 years ago

carlosalvidrez commented 6 years ago

Hi, I followed your command/script to get the metadata:

    from gutenberg.acquire import get_metadata_cache
    cache = get_metadata_cache()
    cache.populate()

And I get the warning below, so I wonder how to run it so that the data is stored in the DB faster:

    WARNING:root:Unable to create cache based on BSD-DB. Falling back to SQLite backend. Performance may be degraded significantly.

Thanks for sharing this!

c-w commented 6 years ago

Hi @carlosalvidrez. Thanks for reaching out and apologies for the delay in answering.

You're seeing this message because Gutenberg can't detect an installation of the bsddb library on your system, so it falls back to the SQLite backend.

Code to instantiate the cache:

https://github.com/c-w/gutenberg/blob/36cc023a722e5c6ffa84dd9e84992bd09fe633f1/gutenberg/acquire/metadata.py#L270-L280

Code to check if BSDDB can be used:

https://github.com/c-w/gutenberg/blob/36cc023a722e5c6ffa84dd9e84992bd09fe633f1/gutenberg/acquire/metadata.py#L190-L201
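To see for yourself whether Python can find a BSD-DB binding, something like the following may help. It is only a minimal sketch and not the project's actual check (that lives at the link above). It assumes the binding is importable as `bsddb3` (the PyPI package commonly used on Python 3) or as `bsddb` (bundled with Python 2). The helper name `find_bsddb_module` is made up for this illustration.

```python
import importlib.util


def find_bsddb_module(candidates=("bsddb3", "bsddb")):
    """Return the name of the first importable candidate module, or None.

    Hypothetical helper for illustration; the candidate names are an
    assumption about which BSD-DB bindings might be installed.
    """
    for name in candidates:
        try:
            # find_spec returns None when a top-level module cannot be located.
            if importlib.util.find_spec(name) is not None:
                return name
        except (ImportError, ValueError):
            # Treat a broken or half-initialised module as unavailable.
            continue
    return None


if __name__ == "__main__":
    found = find_bsddb_module()
    if found is None:
        print("No BSD-DB binding found; expect the SQLite fallback warning.")
    else:
        print("Found BSD-DB binding: " + found)
```

If this reports that no binding was found, the fallback warning is expected until a binding is installed for the interpreter you run Gutenberg with.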

Which version of Python are you using (e.g. 2.7 or 3.5 or similar) and on which operating system are you running? In order to install BSDDB on your system, you'll have to follow instructions specific to your Python version and OS platform. Take a look here and let me know if you require more information: https://github.com/c-w/gutenberg#installation

carlosalvidrez commented 6 years ago

Thank you, all set.

johanwei commented 5 years ago

I'm having the same problem. How did you fix this?

carlosalvidrez commented 5 years ago

Unfortunately, I could not fix it. I had to start over and wait many hours for the cache to rebuild from scratch.

MasterOdin commented 5 years ago

@johanwei what version of Python are you using and what is your OS?

johanwei commented 5 years ago

Python 3.6 and macOS High Sierra.