wpoa / open-access-media-importer

A tool for harvesting media files from Open Access articles for upload into Wikimedia Commons
http://commons.wikimedia.org/wiki/User:Open_Access_Media_Importer_Bot

Converter stalls occasionally #18

Open Daniel-Mietchen opened 11 years ago

Daniel-Mietchen commented 11 years ago

Affected DOIs (UPDATE: complete list as of Nov 16 is at https://github.com/erlehmann/open-access-media-importer/issues/18#issuecomment-10434886 ): 10.1371/journal.pone.0026598 10.1371/journal.pbio.0060198 10.1371/journal.pone.0042698 10.1186/1471-2202-11-110 10.1186/1472-6807-7-71 10.1186/1471-2202-10-22 10.1186/1471-2377-10-37 10.1371/journal.pone.0016734 10.1371/journal.pone.0018056 10.1371/journal.pcbi.1001040 10.1371/journal.pone.0008511 10.1371/journal.pone.0047486

erlehmann commented 11 years ago

Any output?

Daniel-Mietchen commented 11 years ago

Normally no output other than the stalled progress bar. However, just had one where it actually said "aborted" (never seen this before): 10.1371/journal.pone.0000794.

Daniel-Mietchen commented 11 years ago

Here is an error message for 10.1371/journal.pone.0014414 :

Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 3011000
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3011000”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“Fibroblast Growth Factor-10 Promotes Cardiomyocyte Differentiation from Embryonic and Induced Pluripotent Stem Cells”:
1 × application/vnd.ms-excel
3 × image/tiff
4 × video/x-msvideo

Checking MIME types … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3011000/bin/pone.0014414.s005.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3011000/bin/pone.0014414.s006.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3011000/bin/pone.0014414.s007.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3011000/bin/pone.0014414.s008.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0014414.s005.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0014414.s005.avi.ogv” … \
(oa-cache:20395): WARNING **: ffmpegcsp0: size 305152 is not a multiple of unit size 306176

Daniel-Mietchen commented 11 years ago

10.1371/journal.pone.0014414 again:

Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0014414.s005.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0014414.s005.avi.ogv” … \
(oa-cache:6877): WARNING **: ffmpegcsp0: size 305152 is not a multiple of unit size 306176

Daniel-Mietchen commented 11 years ago

Some more, with no error message: 10.1371/journal.pone.0017314 10.1371/journal.pone.0004960 10.1371/journal.pone.0030182 10.1371/journal.ppat.1002280 10.1371/journal.pone.0019812 10.1371/journal.pone.0017314 10.1371/journal.pone.0038027 10.1371/journal.pone.0038228 10.1371/journal.pone.0042698.

erlehmann commented 11 years ago

Suggested Timeout: If GStreamer is not answering for 5 minutes, break.
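
The suggested timeout amounts to a hard wall-clock limit per conversion. A minimal sketch of that idea, assuming the conversion can be started as a separate command; the function name `convert_with_timeout` and the way the command is passed in are hypothetical and not taken from the importer, which may drive GStreamer differently:

```python
import subprocess
import sys


def convert_with_timeout(command, timeout_seconds=300):
    """Run a conversion command with a wall-clock limit.

    Returns True if the command exits with status 0, and False if it
    fails or does not finish within `timeout_seconds` (5 minutes by default).
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and reaps the child before re-raising.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # Stand-in for a real converter: a child process that sleeps too long.
    stalled = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(convert_with_timeout(stalled, timeout_seconds=1))
```

A fixed limit like this also cuts off long but healthy conversions, so the limit would have to be chosen generously.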

Daniel-Mietchen commented 11 years ago

The complete list of all stalling DOIs reported so far:

10.1186/1471-2202-10-22 10.1186/1471-2202-11-110 10.1186/1471-2377-10-37 10.1186/1472-6807-7-71 10.1186/1746-4358-4-4 10.1186/1476-7120-2-15 10.1186/1476-7120-7-44 10.1186/1476-7120-9-8 10.1186/1749-7221-5-7 10.1186/1749-8104-2-15 10.1186/1750-1326-5-41 10.1186/1756-3305-5-179 10.1186/1757-1626-2-193 10.1186/bcr2358 10.1186/cc6118 10.1371/journal.pbio.0060198 10.1371/journal.pcbi.1001040 10.1371/journal.pgen.1000484 10.1371/journal.pone.0000794 10.1371/journal.pone.0004536 10.1371/journal.pone.0004938 10.1371/journal.pone.0004960 10.1371/journal.pone.0008511 10.1371/journal.pone.0013124 10.1371/journal.pone.0014414 10.1371/journal.pone.0016734 10.1371/journal.pone.0017314 10.1371/journal.pone.0019733 10.1371/journal.pone.0019812 10.1371/journal.pone.0018056 10.1371/journal.pone.0020485 10.1371/journal.pone.0023573 10.1371/journal.pone.0023667 10.1371/journal.pone.0023808 10.1371/journal.pone.0026598 10.1371/journal.pone.0030182 10.1371/journal.pone.0032554 10.1371/journal.pone.0038027 10.1371/journal.pone.0038228 10.1371/journal.pone.0042114 10.1371/journal.pone.0042698 10.1371/journal.pone.0047486 10.1371/journal.ppat.1002280 10.1371/journal.ppat.1003058

Done:

10.1371/journal.ppat.1003023

erlehmann commented 11 years ago

Probably fixed with 7c56d57a42b3c278c6da9792a214073a5e39a554: conversion now fails if the conversion pipeline stalls for 10 seconds.
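
A stall check of this kind can be sketched independently of GStreamer. The class below is an illustration only, not the code from the commit: `StallDetector`, its `update` method, and the injectable `clock` are hypothetical names, and the caller is assumed to poll the pipeline position itself at regular intervals.

```python
import time


class StallDetector:
    """Report a stall when the position has not changed for `limit` seconds."""

    def __init__(self, limit=10.0, clock=time.monotonic):
        self.limit = limit
        self.clock = clock
        self.last_position = None
        self.last_change = clock()

    def update(self, position):
        """Record the current position; return True if the pipeline looks stalled."""
        now = self.clock()
        if position != self.last_position:
            # Progress was made, so restart the stall timer.
            self.last_position = position
            self.last_change = now
            return False
        return (now - self.last_change) >= self.limit


if __name__ == "__main__":
    detector = StallDetector(limit=0.2)
    print(detector.update(0))   # first sample counts as progress
    time.sleep(0.3)
    print(detector.update(0))   # same position after the limit has passed
```

Unlike a fixed overall time limit, this only gives up when no progress is observed, so long conversions that keep advancing are not interrupted.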

Daniel-Mietchen commented 11 years ago

daniel@oami-host:~/open-access-media-importer$ git pull
Already up-to-date.
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pone.0026598 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 3202553
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3202553”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“Social and Nonsocial Content Differentially Modulates Visual Attention and Autonomic Arousal in Rhesus Macaques”:
6 × video/quicktime
3 × application/msword

Checking MIME types … 100%
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s004.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s005.mov.
Traceback (most recent call last):
  File "./oa-get", line 132, in <module>
    license_url = material.article.license_url
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 168, in __get__
    return self.impl.get(instance_state(instance),dict_)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 453, in get
    value = self.callable_(state, passive)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/strategies.py", line 490, in _load_for_state
    passive
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/strategies.py", line 531, in _get_ident_for_use_get
    for pk in self.mapper.primary_key
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/mapper.py", line 1641, in _get_state_attr_by_column
    return state.manager[prop.key].impl.get(state, dict_, passive=passive)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 451, in get
    value = callable_(passive)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/state.py", line 285, in __call__
    self.manager.deferred_scalar_loader(self, toload)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/mapper.py", line 1714, in _load_scalar_attributes
    raise orm_exc.ObjectDeletedError(state)
sqlalchemy.orm.exc.ObjectDeletedError: Instance '<SupplementaryMaterial at 0x1891c90>' has been deleted, or its row is otherwise not present.
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pone.0026598 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 3202553
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3202553”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“Social and Nonsocial Content Differentially Modulates Visual Attention and Autonomic Arousal in Rhesus Macaques”:
6 × video/quicktime
3 × application/msword

Checking MIME types … 100%
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s004.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s005.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s006.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s007.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s008.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s009.mov.
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2841623/bin/pgen.1000884.s005.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2841623/bin/pgen.1000884.s006.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000884.s005.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000884.s005.avi.ogv” … 8%
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000884.s006.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000884.s006.avi.ogv” … 7%
Traceback (most recent call last):
  File "./oa-cache", line 134, in <module>
    f = mutagen.oggtheora.OggTheora(temporary_media_path)
  File "/usr/lib/python2.7/dist-packages/mutagen/__init__.py", line 73, in __init__
    self.load(filename, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/mutagen/ogg.py", line 438, in load
    fileobj = file(filename, "rb")
IOError: [Errno 2] No such file or directory: '/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/current.ogv'
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pone.0026598 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 3202553
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3202553”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“Social and Nonsocial Content Differentially Modulates Visual Attention and Autonomic Arousal in Rhesus Macaques”:
6 × video/quicktime
3 × application/msword

Checking MIME types … 100%
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s004.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s005.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s006.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s007.mov.
Traceback (most recent call last):
  File "./oa-get", line 164, in <module>
    session.commit()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 656, in commit
    self.transaction.commit()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 314, in commit
    self._prepare_impl()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 298, in _prepare_impl
    self.session.flush()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1583, in flush
    self._flush(objects)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1654, in _flush
    flush_context.execute()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 59, in save_obj
    mapper, table, update)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 485, in _emit_update_statements
    execute(statement, params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (OperationalError) disk I/O error u'UPDATE model_supplementarymaterial SET downloaded=? WHERE model_supplementarymaterial.url = ?' (1, u'http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s007.mov')
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pone.0026598 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 3202553
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3202553”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“Social and Nonsocial Content Differentially Modulates Visual Attention and Autonomic Arousal in Rhesus Macaques”:
6 × video/quicktime
3 × application/msword

Checking MIME types … 100%
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s004.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s005.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s006.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s007.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s008.mov.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s009.mov.
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s015.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s016.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s017.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s018.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s019.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s020.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s021.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s015.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s015.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp0: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s016.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s016.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp1: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s017.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s017.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp2: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s018.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s018.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp3: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s019.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s019.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp4: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s020.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s020.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp5: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000972.s021.avi”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1000972.s021.avi.ogv” …
(oa-cache:11858): WARNING **: ffmpegcsp6: size 958976 is not a multiple of unit size 960000
ERROR: GStreamer encountered a general stream error.
done.
Skipping “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0026598.s004.mov.ogv”, already exists at http://commons.wikimedia.org/w/api.php.
Skipping “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0026598.s005.mov.ogv”, already exists at http://commons.wikimedia.org/w/api.php.
Skipping “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0026598.s006.mov.ogv”, already exists at http://commons.wikimedia.org/w/api.php.
Skipping “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0026598.s007.mov.ogv”, already exists at http://commons.wikimedia.org/w/api.php.
Skipping “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0026598.s008.mov.ogv”, already exists at http://commons.wikimedia.org/w/api.php.
Skipping “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pone.0026598.s009.mov.ogv”, already exists at http://commons.wikimedia.org/w/api.php.
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pbio.0060198 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 2494560
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=2494560”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“A Novel Molecular Solution for Ultraviolet Light Detection in Caenorhabditis elegans”:
9 × video/quicktime
10 × application/pdf
1 × application/msword

Checking MIME types …
Traceback (most recent call last):
  File "./oa-get", line 112, in <module>
    session.commit()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 656, in commit
    self.transaction.commit()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 314, in commit
    self._prepare_impl()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 298, in _prepare_impl
    self.session.flush()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1583, in flush
    self._flush(objects)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1654, in _flush
    flush_context.execute()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 59, in save_obj
    mapper, table, update)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 485, in _emit_update_statements
    execute(statement, params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (OperationalError) no such table: model_supplementarymaterial u'UPDATE model_supplementarymaterial SET mimetype_reported=?, mime_subtype_reported=? WHERE model_supplementarymaterial.url = ?' (u'application', u'pdf', u'http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2494560/bin/pbio.0060198.sg004.pdf')
daniel@oami-host:~/open-access-media-importer$ git pull
Already up-to-date.
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pbio.0060198 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 2494560
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=2494560”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“A Novel Molecular Solution for Ultraviolet Light Detection in Caenorhabditis elegans”:
9 × video/quicktime
10 × application/pdf
1 × application/msword

Checking MIME types …
Traceback (most recent call last):
  File "./oa-get", line 58, in <module>
    url = material.url
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 168, in __get__
    return self.impl.get(instance_state(instance),dict_)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 451, in get
    value = callable_(passive)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/state.py", line 285, in __call__
    self.manager.deferred_scalar_loader(self, toload)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/mapper.py", line 1714, in _load_scalar_attributes
    raise orm_exc.ObjectDeletedError(state)
sqlalchemy.orm.exc.ObjectDeletedError: Instance '<SupplementaryMaterial at 0x2434450>' has been deleted, or its row is otherwise not present.
daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pgen.1000972 | ./oami_pmc_doi_import
Removing “/home/daniel/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace:
Getting PubMed Central IDs for given DOIs … found: 2877737
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=2877737”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100%
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processors[key](compiled_params[key]))
“Manipulation of Behavioral Decline in Caenorhabditis elegans with the Rag GTPase raga-1”:
7 × video/x-msvideo
8 × image/tiff
6 × application/msword

Checking MIME types … 100%
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s015.avi.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s016.avi.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s017.avi.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s018.avi.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s019.avi.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s020.avi.
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877737/bin/pgen.1000972.s021.avi.
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s014.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s015.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s016.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s017.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s018.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s019.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s020.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s021.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s022.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s023.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s024.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s025.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100%
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s026.mov.
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s027.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100% |##########################################################################################################################################################################################################| Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s028.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100% |##########################################################################################################################################################################################################| Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s029.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100% |##########################################################################################################################################################################################################| Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s030.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100% |##########################################################################################################################################################################################################| Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s031.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100% |##########################################################################################################################################################################################################| Downloading 
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2991264/bin/pgen.1001219.s032.mov, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” … 100% |##########################################################################################################################################################################################################|
Converting “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1001219.s014.mov”, saving into “/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/pgen.1001219.s014.mov.ogv” … 15% |##############################
Traceback (most recent call last):
  File "./oa-cache", line 134, in <module>
    f = mutagen.oggtheora.OggTheora(temporary_media_path)
  File "/usr/lib/python2.7/dist-packages/mutagen/__init__.py", line 73, in __init__
    self.load(filename, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/mutagen/ogg.py", line 438, in load
    fileobj = file(filename, "rb")
IOError: [Errno 2] No such file or directory: '/home/daniel/.cache/open-access-media-importer/media/refined/pmc_doi/current.ogv'
daniel@oami-host:~/open-access-media-importer$

Daniel-Mietchen commented 11 years ago

Especially alarming are lines like

Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s009.mov.
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2841623/bin/pgen.1000884.s005.avi, saving into directory “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi” …

This means that workflows for different DOIs are being mixed up.

Daniel-Mietchen commented 11 years ago

Just did git reset --hard HEAD^ in order to resume import.

Daniel-Mietchen commented 11 years ago

10.1186/1476-7120-7-44 still stalls. For some other DOIs, the bug seems partially fixed by https://github.com/erlehmann/open-access-media-importer/commit/d1c41871f3a8604f9efa1ec0950e812b5ecd79c3 , but the abort in "Conversion of </home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1472-6807-7-71-S14.mpg> stalled, aborting … " should apply only to the file whose conversion stalled, not to the entire set of files from the article; the others should still be uploaded.

Daniel-Mietchen commented 11 years ago

Reopening, as per https://github.com/erlehmann/open-access-media-importer/issues/18#issuecomment-10561289

Daniel-Mietchen commented 11 years ago

Perhaps a timeout of 10 s without interaction is too short? Does 1471-2121-6-16-S2.mpg eventually convert if run without the time restriction?

erlehmann commented 11 years ago

In a test run with the timeout increased to 60 seconds, 1471-2121-6-16-S2.mpg seems to run through. Proof at https://species-id.net/wiki/File:The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv.

erlehmann commented 11 years ago

Media conversion timeout increased to 60s, see 4e374b5108fb70422b0d8cb4bb68175f6adf956e.
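
A stall timeout of this kind can be sketched as a small watchdog that gives up only when the reported position has not advanced for the configured number of seconds. This is a minimal sketch and not the code from the commit above: StallWatchdog, ConversionStalled and the injectable clock parameter are hypothetical names introduced for illustration.

```python
import time


class ConversionStalled(Exception):
    """Raised when a conversion reports no progress for too long."""


class StallWatchdog(object):
    """Detect stalled conversions (hypothetical helper, not OAMI code)."""

    def __init__(self, timeout=60.0, clock=time.time):
        self.timeout = timeout        # seconds without progress before giving up
        self.clock = clock            # injectable so the logic can be tested
        self.last_position = None
        self.last_change = clock()

    def update(self, position):
        """Record the current position; raise if it has not moved in time."""
        now = self.clock()
        if position != self.last_position:
            # Progress was made, so restart the stall timer.
            self.last_position = position
            self.last_change = now
        elif now - self.last_change > self.timeout:
            raise ConversionStalled(
                "no progress for %.0f s at position %r"
                % (now - self.last_change, position))
```

The conversion loop would call update() with the current pipeline position on every poll; a slow but advancing conversion never trips the watchdog, whatever its total duration.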

Daniel-Mietchen commented 11 years ago

Now the file went through ( http://commons.wikimedia.org/wiki/File:The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv ), but it does not play for me.

More importantly, a number of the DOIs listed in https://github.com/erlehmann/open-access-media-importer/issues/18#issuecomment-10434886 still stall (e.g. 10.1371/journal.pcbi.1001040). For those that no longer stall, the abort drops all files of the article, not just the ones that fail to convert.

erlehmann commented 11 years ago

1007 ~ % file The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv
The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv: Ogg data, Theora video
1008 ~ % oggz-info The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv
Content-Duration: 00:00:00.000
Theora: serialno 1635537489
    1123 packets in 439 pages, 2.6 packets/page, 0.571% Ogg overhead
    Theora-Version: 3.2.1
    Video-Framerate: 25.000 fps
    Video-Width: 512
    Video-Height: 512
1010 ~ % oggz-validate The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv
The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv: Error:
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:00.080)
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:00.520)
00:00:00.000: serialno 1635537489: Granulepos decreasing within track
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:01.560)
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:02.080)
00:00:00.000: serialno 1635537489: Granulepos decreasing within track
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:03.120)
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:03.640)
00:00:00.000: serialno 1635537489: Granulepos decreasing within track
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:04.680)
00:00:00.000: serialno 1635537489: Packet out of order (previous 00:00:05.200)
00:00:00.000: serialno 1635537489: Granulepos decreasing within track
oggz-validate --max-errors 10: maximum error count reached, bailing out ...

Daniel-Mietchen commented 11 years ago

The file at http://commons.wikimedia.org/wiki/File:The-C-terminal-subunit-of-artificially-truncated-human-cathepsin-B-mediates-its-nuclear-targeting-1471-2121-6-16-S2.ogv now plays fine for me too.

The aborts still kick out the whole paper when one conversion fails. This should be changed so that only the conversions (and uploads) that are actually affected are skipped.
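
The requested behaviour could look roughly like the sketch below: each file is converted inside its own try block, so a failure is recorded and the loop carries on with the remaining files of the article. The names convert_all and convert are hypothetical and do not come from the importer's code base.

```python
def convert_all(paths, convert):
    """Convert each path independently.

    `convert` is any callable that takes a source path and returns the
    path of the converted file, raising an exception on failure.
    Returns a pair (converted, failed).
    """
    converted, failed = [], []
    for path in paths:
        try:
            converted.append(convert(path))
        except Exception as error:
            # Record the failure and continue with the remaining files
            # instead of aborting the whole article.
            failed.append((path, error))
    return converted, failed
```

Only the entries in converted would then be handed to the upload step, and failed could be stored so that later runs skip those files.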

erlehmann commented 11 years ago

With 10.1186/1476-7120-7-44, conversion stalls at the following lines in the media helper conversion function:

        pipeline.set_state(gst.STATE_PLAYING)
        pipeline.get_state()

erlehmann commented 11 years ago

I think there is something subtly wrong with my encoding pipeline. I will proceed to build a test case for encoding only.

Daniel-Mietchen commented 11 years ago

Some of the PMCIDs for which one stalled conversion causes all files from the article to be aborted:

2727657 3498229 3499580 3356593 3500284

Daniel-Mietchen commented 11 years ago

Just ran over the DOIs listed in https://github.com/erlehmann/open-access-media-importer/issues/18#issuecomment-10434886 . We now have a list of the files that stall, and the other files of those articles have all been downloaded and converted. However, only the first few have actually been uploaded.

A log (I canceled at the end, after nothing had happened for a few minutes): daniel@oami-host:~/open-access-media-importer$ echo 10.1371/journal.pone.0042114 | ./oami_pmc_doi_import Input DOIs, delimited by whitespace: Getting PubMed Central IDs for given DOIs … found: 3407134 Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3407134”, saving into directory “/home/daniel/.cache/open-access-media-importer/metadata/raw/pmc_doi” … 100% |#############################################################################################################################################################################################| Skipping 56 records … Checking MIME types … No materials found where MIME type has to be checked. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S1.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S2.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S3.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S4.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S5.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S6.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S7.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S8.avi, already exists at http://commons.wikimedia.org/w/api.php. 
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S9.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S10.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S11.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S12.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S13.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S14.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S15.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S16.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S17.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S18.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S19.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S20.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S21.avi, already exists at http://commons.wikimedia.org/w/api.php. 
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S22.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S23.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S24.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S25.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S27.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S28.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S26.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S29.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S30.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S31.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S32.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S33.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S34.avi, already exists at http://commons.wikimedia.org/w/api.php. 
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S35.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S36.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S37.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S38.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S39.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S41.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S42.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S43.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S44.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S45.avi, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516786/bin/1476-7120-2-15-S46.avi, already exists at http://commons.wikimedia.org/w/api.php. Unknown, possibly non-free license: http://www.nature.com/authors/editorial_policies/license.html#terms Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950908/bin/pone.0000794.s002.mov, already exists at http://commons.wikimedia.org/w/api.php. 
Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2948515/bin/pone.0013124.s001.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s004.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s005.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s006.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s007.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s008.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202553/bin/pone.0026598.s009.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3498212/bin/pone.0049600.s001.mov, already exists at http://commons.wikimedia.org/w/api.php. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0049240.s024.mp4”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1471-2202-10-22-S1.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1471-2202-11-110-S1.MOV”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1471-2377-10-37-S2.MPG”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1472-6807-7-71-S14.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1746-4358-4-4-S2.mpg”, earlier attempt failed. 
Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1476-7120-9-8-S1.AVI”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1476-7120-9-8-S2.AVI”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1476-7120-9-8-S3.AVI”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1749-8104-2-15-S8.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1749-8104-2-15-S9.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1749-7221-5-7-S2.mpeg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1750-1326-5-41-S9.WMV”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1756-3305-5-179-S1.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/1757-1626-2-193-S1.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/bcr2358-S1.QT”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/cc6118-S4.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pbio.0060198.sv002.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pcbi.1001040.s002.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pgen.1000484.s007.mov”, earlier attempt failed. 
Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0000794.s003.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0004536.s001.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0004536.s002.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0004536.s003.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0004938.s001.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0004960.s002.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0008511.s003.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0016734.s002.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0017314.s001.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0019733.s002.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0019812.s001.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0019812.s002.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0019812.s003.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0019812.s004.avi”, earlier attempt failed. 
Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0018056.s003.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0020485.s003.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0023808.s010.mp4”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0030182.s003.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0032554.s013.mp4”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0038027.s007.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0038027.s008.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0038228.s004.mpg”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0042114.s004.avi”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0042698.s005.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/pone.0047486.s004.wmv”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/ppat.1002280.s008.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/ppat.1002280.s009.mov”, earlier attempt failed. Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/ppat.1002280.s010.mov”, earlier attempt failed. 
Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/ppat.1002280.s011.mov”, earlier attempt failed.
Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/ppat.1002280.s012.mov”, earlier attempt failed.
Skipping conversion of “/home/daniel/.cache/open-access-media-importer/media/raw/pmc_doi/ppat.1002280.s013.mov”, earlier attempt failed.
^CTraceback (most recent call last):
  File "./oa-put", line 110, in <module>
    mediawiki.upload(media_refined_path, wiki_filename, page_template)
  File "/home/daniel/open-access-media-importer/helpers/mediawiki.py", line 47, in upload
    comment = 'Automatically uploaded media file from [[:en:Open access|Open Access]] source. Please report problems or suggestions [[User talk:Open Access Media Importer Bot|here]].'
  File "/home/daniel/open-access-media-importer/helpers/wikitools/wikifile.py", line 229, in upload
    res = req.query()
  File "/home/daniel/open-access-media-importer/helpers/wikitools/api.py", line 139, in query
    rawdata = self.getRaw()
  File "/home/daniel/open-access-media-importer/helpers/wikitools/api.py", line 214, in getRaw
    data = self.opener.open(self.request)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1174, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  File "/usr/lib/python2.7/httplib.py", line 958, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 992, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 790, in send
    self.sock.sendall(data)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
KeyboardInterrupt
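
The traceback shows the interrupt arriving inside sock.sendall during the upload. If the upload was indeed blocked there rather than just slow, one possible mitigation is a socket timeout around the upload call. The sketch below is an assumption about how that could be done, not a description of the importer: with_socket_timeout is a hypothetical helper, and the default timeout only affects sockets created while the wrapped call is running.

```python
import socket


def with_socket_timeout(seconds, func, *args, **kwargs):
    """Call func with a temporary default socket timeout.

    Blocking operations on sockets created during the call then raise
    a timeout error instead of hanging indefinitely.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        return func(*args, **kwargs)
    finally:
        # Restore the earlier setting, even if func raised.
        socket.setdefaulttimeout(previous)
```

A call such as with_socket_timeout(60, mediawiki.upload, media_refined_path, wiki_filename, page_template) would then fail with an exception that the caller can catch, and the run could move on to the next file.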

Daniel-Mietchen commented 11 years ago

Another example: all videos in 10.1371/journal.ppat.1002280 stall at the very beginning of the conversion.

Daniel-Mietchen commented 10 years ago

Another one: 10.1371/journal.pone.0079600

Daniel-Mietchen commented 10 years ago

Also affected: 10.1371/journal.pcbi.1003347

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0083171

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0085957

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0086193

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0084809

Daniel-Mietchen commented 10 years ago

10.1186/1743-0003-10-111

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0089961

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0097559

Daniel-Mietchen commented 9 years ago

10.3389/fgene.2014.00202

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0103460

Daniel-Mietchen commented 9 years ago

10.3389/fbioe.2014.00037

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0109007

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0110710

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0107271

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0111787

erlehmann commented 9 years ago

File from 10.1371/journal.pone.0111787 that stalls both oami-converter-test.py and totem: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4216124/bin/pone.0111787.s002.mpg

erlehmann commented 9 years ago

Bug reported at https://bugzilla.gnome.org/show_bug.cgi?id=740102 for file from 10.1371/journal.pone.0111787.

erlehmann commented 9 years ago

Bug reported at https://bugzilla.gnome.org/show_bug.cgi?id=740103 for file from 10.1371/journal.pone.0107271.

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0111467

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0109926

erlehmann commented 9 years ago

GStreamer developers cannot reproduce the bugs with GStreamer 1.4 – OAMI currently uses GStreamer 0.10.36, which is no longer maintained: https://bugzilla.gnome.org/show_bug.cgi?id=740103 https://bugzilla.gnome.org/show_bug.cgi?id=740102

erlehmann commented 9 years ago

Unfortunately, the Python interface for GStreamer 1.0 differs from the one for GStreamer 0.10: GStreamer is now called from Python via GObject introspection instead of static bindings. The breaking changes are generally subtle; for example, pipeline.set_state(gst.STATE_PLAYING) is now pipeline.set_state(Gst.State.PLAYING). See https://wiki.ubuntu.com/Novacut/GStreamer1.0 and http://cgit.freedesktop.org/gstreamer/gstreamer/tree/docs/random/porting-to-1.0.txt
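
To illustrate the kind of rename involved, here is a small sketch that rewrites a few 0.10-style names to their 1.0 equivalents. The list is deliberately incomplete and port_line is a hypothetical helper, not part of OAMI or GStreamer. A real port also needs gi.require_version('Gst', '1.0'), from gi.repository import Gst and Gst.init(None), and textual renames do not cover behavioural changes.

```python
# A few renames between the static 0.10 bindings (module "gst") and the
# GObject-introspection bindings for GStreamer 1.0 (module "Gst").
RENAMES = [
    ("gst.STATE_NULL", "Gst.State.NULL"),
    ("gst.STATE_PAUSED", "Gst.State.PAUSED"),
    ("gst.STATE_PLAYING", "Gst.State.PLAYING"),
    ("gst.MESSAGE_EOS", "Gst.MessageType.EOS"),
    ("gst.MESSAGE_ERROR", "Gst.MessageType.ERROR"),
    ("gst.element_factory_make", "Gst.ElementFactory.make"),
]


def port_line(line):
    """Apply the textual renames above to one line of source code."""
    for old, new in RENAMES:
        line = line.replace(old, new)
    return line
```

For example, port_line("pipeline.set_state(gst.STATE_PLAYING)") yields the 1.0 form quoted above.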

erlehmann commented 9 years ago

I was able to port this GStreamer 1.0 transcoder example https://bazaar.launchpad.net/~jderose/+junk/gst-examples/view/head:/transcoder-1.0 to Python 2. Need to find out how it is licensed.

erlehmann commented 9 years ago

I have sent an email to Jason Gerard DeRose asking under which license I could use his code.

erlehmann commented 9 years ago

Commit 1ec2cd50dd6fa066ead7393a731ec1985029dc03 contains an updated transcoding process. Note that the code is on the wmde-review branch; it will be merged to master as soon as Daniel Kinzler approves it.