wpoa / open-access-media-importer

A tool for harvesting media files from Open Access articles for upload into Wikimedia Commons
http://commons.wikimedia.org/wiki/User:Open_Access_Media_Importer_Bot

File not uploaded #22

Open Daniel-Mietchen opened 12 years ago

Daniel-Mietchen commented 12 years ago

Affected DOIs: 10.1371/journal.pone.0006573 10.1371/journal.pone.0032965

Daniel-Mietchen commented 11 years ago

Timeout for this one during upload as well: 10.1371/journal.pone.0054076

Daniel-Mietchen commented 11 years ago

Another example: 10.7554/eLife.00411 (189 MB)

RaphaelWimmer commented 11 years ago

Apparently, Wikimedia Commons does not allow uploads larger than 100 MB in a single request and requires larger files to be split into smaller chunks [1]. AFAIK, OAMI does not yet support this upload method. For an example implementation, see the upload() method of Wiki.java [2].

[1] https://www.mediawiki.org/wiki/API:Upload#Chunked_uploading
[2] http://code.google.com/p/wiki-java/source/browse/trunk/src/org/wikipedia/Wiki.java#3083
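
A minimal sketch of the chunked upload flow described in [1], written in Python 3 since OAMI is a Python project. This is not OAMI code: `post` is a hypothetical callable that sends one multipart POST to api.php and returns the decoded JSON reply, and the 5 MiB chunk size is an assumption. The parameter names (`stash`, `filesize`, `offset`, `chunk`, `filekey`) follow the API:Upload documentation as I understand it.

```python
CHUNK_SIZE = 5 * 1024 * 1024  # assumed chunk size, not taken from this thread


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (offset, data) pairs until the stream is exhausted."""
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield offset, data
        offset += len(data)


def chunked_upload(post, stream, filesize, filename, token,
                   chunk_size=CHUNK_SIZE):
    """Stash a file chunk by chunk, then publish it from the stash.

    `post(params, files)` is a hypothetical helper: it performs one
    multipart POST to api.php and returns the decoded JSON reply.
    """
    filekey = None
    for offset, data in iter_chunks(stream, chunk_size):
        params = {
            "action": "upload",
            "format": "json",
            "stash": "1",
            "filename": filename,
            "filesize": str(filesize),
            "offset": str(offset),
            "token": token,
        }
        if filekey is not None:
            # Every chunk after the first refers to the stashed upload.
            params["filekey"] = filekey
        reply = post(params, {"chunk": data})
        filekey = reply["upload"]["filekey"]
    if filekey is None:
        raise ValueError("nothing to upload: the stream is empty")
    # Final request: publish the assembled file from the stash.
    return post({
        "action": "upload",
        "format": "json",
        "filename": filename,
        "filekey": filekey,
        "token": token,
    }, {})
```

Passing `post` in as a parameter keeps the sketch independent of any particular HTTP library; in OAMI it would presumably wrap whatever client the upload step already uses.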

Daniel-Mietchen commented 11 years ago

Another example: For http://dx.doi.org/10.2196/jmir.1911 , only the first 4 appendices (which are under 100 MB) have been uploaded:
http://commons.wikimedia.org/wiki/File:Online-Social-Networks-and-Smoking-Cessation-A-Scientific-Research-Agenda-jmir_v13i4e119_app1.ogv
http://commons.wikimedia.org/wiki/File:Online-Social-Networks-and-Smoking-Cessation-A-Scientific-Research-Agenda-jmir_v13i4e119_app2.ogv
http://commons.wikimedia.org/wiki/File:Online-Social-Networks-and-Smoking-Cessation-A-Scientific-Research-Agenda-jmir_v13i4e119_app3.ogv
http://commons.wikimedia.org/wiki/File:Online-Social-Networks-and-Smoking-Cessation-A-Scientific-Research-Agenda-jmir_v13i4e119_app4.ogv

Appendices 5-13 are above 100MB and have not been uploaded.

Daniel-Mietchen commented 11 years ago

Another example implementation is at https://commons.wikimedia.org/wiki/User:Smallman12q/PyCJWiki (as seen at http://lists.wikimedia.org/pipermail/wikitech-l/2013-June/070122.html ).

Daniel-Mietchen commented 11 years ago

There is a known bug related to that: https://bugzilla.wikimedia.org/show_bug.cgi?id=36587 .

Daniel-Mietchen commented 11 years ago

Just had the following case, in which only one audio file out of 7 seems to have actually been uploaded: http://commons.wikimedia.org/wiki/File:Speech-vs.-singing-infants-choose-happier-sounds-Audio7.ogv .

As far as I can see, there was neither a conversion nor an upload error, nor are the other files already present on Commons. Could this be another instance of https://github.com/erlehmann/open-access-media-importer/issues/83 ?


Tue 2 Jul 23:50:40 CEST 2013
doi: 10.3389/fpsyg.2013.00372
Removing “/home/danielmietchen/.local/share/open-access-media-importer/pmc_doi.sqlite” … done.
Input DOIs, delimited by whitespace: Getting PubMed Central IDs for given DOIs … found: 3693090
Downloading “http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pmc&id=3693090”, saving into directory “/home/danielmietchen/.cache/open-access-media-importer/metadata/raw/pmc_doi” …
100% |########################################|
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processorskey)
“Speech vs. singing: infants choose happier sounds”: 7 × audio/basic

Checking MIME types …
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio1.WAV, source claimed audio/basic but is audio/x-wav.
/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py:463: SAWarning: Unicode type received non-unicode bind param value.
  param.append(processorskey)
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio2.WAV, source claimed audio/basic but is audio/x-wav.
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio3.WAV, source claimed audio/basic but is audio/x-wav.
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio4.WAV, source claimed audio/basic but is audio/x-wav.
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio5.WAV, source claimed audio/basic but is audio/x-wav.
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio6.WAV, source claimed audio/basic but is audio/x-wav.
7 of 7 100% |########################################| Time: 00:00:14
DOI 10.3389/fpsyg.2013.00372, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio7.WAV, source claimed audio/basic but is audio/x-wav.

Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio1.WAV, saving into directory “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi” …
100% |########################################|
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio2.WAV, saving into directory “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi” …
100% |########################################|
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio3.WAV, saving into directory “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi” …
100% |########################################|
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio4.WAV, saving into directory “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi” …
100% |########################################|
Downloading http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio5.WAV, saving into directory “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi” …
100% |########################################|
Skipping download of http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio6.WAV.
Skipping download of http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693090/bin/Audio7.WAV.

Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio1.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio1.WAV.ogv”.
Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio2.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio2.WAV.ogv”.
Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio3.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio3.WAV.ogv”.
Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio4.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio4.WAV.ogv”.
Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio5.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio5.WAV.ogv”.
Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio6.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio6.WAV.ogv”.
Skipping conversion of “/home/danielmietchen/.cache/open-access-media-importer/media/raw/pmc_doi/Audio7.WAV”, exists at “/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio7.WAV.ogv”.

Authenticating with http://commons.wikimedia.org/w/api.php.
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio1.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
Authenticating with http://commons.wikimedia.org/w/api.php.
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio2.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
Authenticating with http://commons.wikimedia.org/w/api.php.
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio3.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
Authenticating with http://commons.wikimedia.org/w/api.php.
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio4.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
Authenticating with http://commons.wikimedia.org/w/api.php.
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio5.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
Authenticating with http://commons.wikimedia.org/w/api.php.
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio6.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
Authenticating with http://commons.wikimedia.org/w/api.php.
Throttled
“/home/danielmietchen/.cache/open-access-media-importer/media/refined/pmc_doi/Audio7.WAV.ogv” uploaded to http://commons.wikimedia.org/w/api.php.
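
The log above prints "uploaded" for all seven files even though only one appeared on Commons, which suggests the message may be emitted without inspecting the API reply. A hedged Python 3 sketch of such a check follows. It assumes the usual MediaWiki reply shapes (a top-level `error` object, or an `upload` object whose `result` is `Success` or `Warning`); `classify_upload_reply` is a hypothetical helper, not part of OAMI.

```python
def classify_upload_reply(reply):
    """Return a (status, detail) pair for a decoded api.php upload reply.

    Only a "success" status means the file was actually published; a
    warning reply leaves the file unpublished unless the request is
    repeated with warnings ignored.
    """
    if "error" in reply:
        error = reply["error"]
        return "error", "%s: %s" % (error.get("code"), error.get("info"))
    upload = reply.get("upload", {})
    result = upload.get("result")
    if result == "Success":
        return "success", upload.get("filename")
    if result == "Warning":
        # Warning names (e.g. "duplicate", "exists") are the dict keys.
        return "warning", ", ".join(sorted(upload.get("warnings", {})))
    return "unknown", repr(reply)


# Illustrative reply, not captured from the run above.
status, detail = classify_upload_reply(
    {"upload": {"result": "Warning", "warnings": {"duplicate": ["Other.ogv"]}}}
)
print(status, detail)
```

Logging the status instead of an unconditional "uploaded" would make cases like this one, and possibly the throttling seen for Audio7, visible in the output.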

Daniel-Mietchen commented 11 years ago

Another example: Movie 1 of 10.7554/eLife.00632 (size after conversion: 106155907 bytes)

Daniel-Mietchen commented 11 years ago

Another one: 10.1371/journal.pone.0074583 (size: 217958641 bytes)

Daniel-Mietchen commented 10 years ago

10.1186/1743-0003-10-111: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907014/bin/1743-0003-10-111-S2.mp4 (size: 161534317 bytes)
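
The sizes reported so far all sit above the 100 MB limit mentioned earlier in this thread. A small Python 3 sketch of a pre-upload check that picks the upload method from the file size is shown below. It assumes the limit means 100 MiB (104857600 bytes) and a 5 MiB chunk size; both constants and the `upload_plan` helper are assumptions, not OAMI code.

```python
DIRECT_UPLOAD_LIMIT = 100 * 1024 * 1024  # assumed: "100 MB" taken as 100 MiB
CHUNK_SIZE = 5 * 1024 * 1024             # assumed chunk size


def upload_plan(size_bytes, limit=DIRECT_UPLOAD_LIMIT, chunk_size=CHUNK_SIZE):
    """Return ("direct", 1) or ("chunked", number_of_chunks) for a file size."""
    if size_bytes <= limit:
        return "direct", 1
    # Integer ceiling division avoids floating-point rounding.
    return "chunked", (size_bytes + chunk_size - 1) // chunk_size


# File sizes (in bytes) reported in this thread.
for size in (106155907, 217958641, 161534317):
    print(size, upload_plan(size))
```

Under these assumptions, all three files would need the chunked path.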

Daniel-Mietchen commented 10 years ago

Movies S5 and S6 of 10.1371/journal.pone.0095113

Daniel-Mietchen commented 10 years ago

Movie 1 of 10.1186/2193-1801-3-306

Daniel-Mietchen commented 10 years ago

10.1371/journal.pone.0107626

Daniel-Mietchen commented 10 years ago

10.1186/1475-925X-13-140

Daniel-Mietchen commented 9 years ago

10.1371/journal.pcbi.1003935

Daniel-Mietchen commented 9 years ago

10.7554/eLife.00632

Daniel-Mietchen commented 9 years ago

10.1371/journal.pone.0115889