c-w / gutenberg

A simple interface to the Project Gutenberg corpus.
Apache License 2.0

get_metadata and get_etexts return empty set #124

Closed bminixhofer closed 4 years ago

bminixhofer commented 5 years ago

Hi! get_metadata and get_etexts both return an empty frozenset for me, even though the metadata cache appears to be populated:

In [1]: from gutenberg.acquire import get_metadata_cache
INFO:rdflib:RDFLib Version: 4.2.2

In [2]: cache = get_metadata_cache()

In [3]: cache.exists
Out[3]: True

In [4]: cache.populate()
---------------------------------------------------------------------------
CacheAlreadyExistsException               Traceback (most recent call last)
<ipython-input-4-fce0c2e73ef1> in <module>()
----> 1 cache.populate()

~/miniconda3/lib/python3.6/site-packages/gutenberg/acquire/metadata.py in populate(self)
     86         """
     87         if self.exists:
---> 88             raise CacheAlreadyExistsException('location: %s' % self.cache_uri)
     89 
     90         self._populate_setup()

CacheAlreadyExistsException: location: /home/bminixhofer/gutenberg_data/metadata/metadata.db

And load_etext works as expected:

In [5]: from gutenberg.acquire import load_etext

In [6]: load_etext(2701)[:100]
Out[6]: '\r\nThe Project Gutenberg EBook of Moby Dick; or The Whale, by Herman\r\nMelville\r\n\r\nThis eBook is for t'

But get_metadata returns an empty result:

In [7]: from gutenberg.query import get_metadata

In [8]: get_metadata('author', 2701)
Out[8]: frozenset()

Note that it worked yesterday, in the same Python session in which I populated the cache. I am on Ubuntu 16.04. Thanks in advance for any help!
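
One way to narrow this down might be to check whether the cache merely exists on disk or actually holds triples, since the session above only shows that `cache.exists` is true. The helper below is a minimal sketch, not part of the library: `describe_cache` is a hypothetical name, and it assumes the cache object exposes `exists`, `open()`, `close()` and a `graph` attribute that supports `len()` (an rdflib graph reports its triple count that way). Those attributes are an assumption about the object returned by `get_metadata_cache()` and should be checked against the installed version.

```python
def describe_cache(cache):
    """Classify a metadata cache as 'missing', 'empty', or 'populated'.

    Assumption: `cache` has an `exists` attribute, `open()` and `close()`
    methods, and a `graph` attribute whose len() is the number of stored
    triples. This helper is illustrative and not part of gutenberg.
    """
    if not cache.exists:
        return "missing"

    cache.open()
    try:
        triple_count = len(cache.graph)
    finally:
        # Always release the store, even if counting fails.
        cache.close()

    return "empty" if triple_count == 0 else "populated"


# Hypothetical usage with the real library (not executed here):
#   from gutenberg.acquire import get_metadata_cache
#   print(describe_cache(get_metadata_cache()))
```

If this reported "empty", it would suggest that the file on disk exists but holds no triples, so deleting and repopulating the cache might be worth trying. Run it in a fresh session, because opening and closing a cache that a query has already opened could interfere with that query.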

c-w commented 5 years ago

Apologies for the late reply. I somehow missed notifications from this repository for a while. Were you able to get to the bottom of this issue? You could also try to use the hosted version of this API available at gutenberg.justamouse.com.

bminixhofer commented 5 years ago

No worries. I decided to use another corpus for my project, so I didn't investigate this issue further. I'll see if I can reproduce it with my current setup.

Is the hosted version suitable for downloading many books? I don't want to DoS your server ;)
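
For what it's worth, a bulk download could be kept gentle by pausing between requests. This is only a sketch under stated assumptions: `fetch_politely` and the `fetch` callable are hypothetical stand-ins for whatever actually retrieves a book, and the one-second default delay is an arbitrary choice, not a limit stated anywhere in this thread.

```python
import time


def fetch_politely(ids, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(item_id) for each id, pausing between consecutive calls.

    `fetch` is a caller-supplied function (hypothetical here); `sleep` is
    injectable so the pacing can be checked without actually waiting.
    """
    results = {}
    for index, item_id in enumerate(ids):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results[item_id] = fetch(item_id)
    return results
```

Passing a custom `sleep` makes the pacing easy to verify in a test, and the delay can be raised if the server operator asks for a lower request rate.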

c-w commented 4 years ago

Resolving this. Please feel free to reopen if you have any additional questions.