openzim / gutenberg

Scraper for downloading the entire ebooks repository of project Gutenberg
https://download.kiwix.org/zim/gutenberg
GNU General Public License v3.0

Using --books only should automatically do everything #14

Closed: kelson42 closed this issue 7 years ago

kelson42 commented 9 years ago

Specifying the list of steps to run should be optional: when --books is given without any other step option, all the steps should be run.
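
To illustrate the requested behaviour, here is a minimal sketch of how the step selection could work. The step names and the `resolve_steps` helper are hypothetical and only serve the example; the real script's options may be named differently. It assumes a docopt-style dict of options.

```python
# Hypothetical step names, in execution order; the real script may differ.
ALL_STEPS = ("prepare", "parse", "download", "export", "zim")


def resolve_steps(options):
    """Return the steps to run, in order.

    `options` is a docopt-style dict mapping "--flag" to its value.
    If no step flag is set, every step is selected.
    """
    requested = [step for step in ALL_STEPS if options.get("--" + step)]
    # An empty selection means "do everything".
    return requested or list(ALL_STEPS)


# Only --books given: all steps run.
print(resolve_steps({"--books": "10003"}))
# An explicit step flag restricts the run to that step.
print(resolve_steps({"--books": "10003", "--parse": True}))
```

With this approach, --books only narrows which books are processed and never changes which steps are executed.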

Currently it dies:

    ./dump-gutenberg.py -k --books=10003
    CHECKING for dependencies on the system
    PREPARING rdf-files cache from http://www.gutenberg.org/cache/epub/feeds/rdf-files.tar.bz2
    df-files.tar.bz2 already exists in rdf-files.tar.bz2
    RDF-files folder already exists in rdf-files
    PARSING rdf-files in rdf-files
    Setting up the database
    license table already exists.
    format table already exists.
    author table already exists.
    book table already exists.
    bookformat table already exists.
    Looping throught RDF files in rdf-files
    Parsing file rdf-files/10003/pg10003.rdf
    Traceback (most recent call last):
      File "./dump-gutenberg.py", line 150, in <module>
        main(docopt(help, version=0.1))
      File "./dump-gutenberg.py", line 121, in main
        parse_and_fill(rdf_path=RDF_FOLDER, only_books=BOOKS)
      File "/media/data/projs/gutenberg/gutenberg/rdf.py", line 76, in parse_and_fill
        parse_and_process_file(fpath)
      File "/media/data/projs/gutenberg/gutenberg/rdf.py", line 94, in parse_and_process_file
        save_rdf_in_database(parser)
      File "/media/data/projs/gutenberg/gutenberg/rdf.py", line 215, in save_rdf_in_database
        downloads=parser.downloads
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 3038, in create
        inst.save(force_insert=True)
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 3163, in save
        pk_from_cursor = self.insert(**field_dict).execute()
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 2247, in execute
        return self.database.last_insert_id(self._execute(), self.model_class)
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 1838, in _execute
        return self.database.execute_sql(sql, params, self.require_commit)
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 2414, in execute_sql
        self.commit()
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 2283, in __exit__
        reraise(new_type, new_type(*exc_value.args), traceback)
      File "/home/kelson/.virtualenvs/gut/local/lib/python2.7/site-packages/peewee.py", line 2406, in execute_sql
        cursor.execute(sql, params or ())
    peewee.IntegrityError: UNIQUE constraint failed: book.id
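
The log shows that the tables already existed, so book 10003 was probably already stored by an earlier run, and the forced insert then collides on the primary key. Below is a minimal sketch that reproduces the same kind of error and shows one possible way to tolerate it. It uses the standard-library `sqlite3` module instead of peewee, and the table layout and the `insert_book` helper are assumptions made for the example, not the project's actual schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real book table (assumed layout).
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
# Simulate a previous run that already stored book 10003.
conn.execute("INSERT INTO book (id, title) VALUES (?, ?)", (10003, "first run"))


def insert_book(conn, book_id, title):
    """Insert a book, or update it if the id is already present."""
    try:
        conn.execute(
            "INSERT INTO book (id, title) VALUES (?, ?)", (book_id, title)
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # Same failure as in the traceback: the primary key already exists.
        conn.execute(
            "UPDATE book SET title = ? WHERE id = ?", (title, book_id)
        )
        return "updated"


print(insert_book(conn, 10003, "second run"))
print(insert_book(conn, 10004, "new book"))
```

Whether an existing record should be updated, skipped, or the database reset before parsing is a design decision for the project; the sketch only shows that a second run does not have to crash.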