Open mialondon opened 4 years ago
Some thoughts on uncovering the error: @christianalgar can you note the times and dates when you've recently tried to add volumes that led to the error? We might be able to match them to the error messages, which include details in the tracebacks sent via email.
For example, the inbox has errors from Tuesday around 1:30, 2:30 and 5:30pm.
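In case it helps with the matching, here is a rough Python 3 sketch of pairing import attempts with error emails that arrived shortly afterwards. The function name, the 15-minute window, and the sample labels and times are all assumptions for illustration; they are not taken from the inbox or the spreadsheet.

```python
from datetime import datetime, timedelta


def match_attempts_to_errors(attempts, errors, window_minutes=15):
    """Pair each attempt with the error emails that arrived within the window after it.

    attempts: list of (label, datetime) for when a volume/task was added
    errors:   list of (job_id, datetime) for when an error email arrived
    """
    window = timedelta(minutes=window_minutes)
    matches = {}
    for label, attempted_at in attempts:
        matches[label] = [
            job_id
            for job_id, arrived_at in errors
            # Only count errors that arrive after the attempt, within the window.
            if timedelta(0) <= arrived_at - attempted_at <= window
        ]
    return matches


# Hypothetical sample data, purely for illustration.
attempts = [
    ("volume A", datetime(2020, 6, 15, 8, 25)),
    ("volume B", datetime(2020, 6, 15, 14, 26)),
    ("volume C", datetime(2020, 6, 15, 17, 0)),
]
errors = [
    ("job-1", datetime(2020, 6, 15, 8, 34)),
    ("job-2", datetime(2020, 6, 15, 14, 33)),
]

matches = match_attempts_to_errors(attempts, errors)
```

An attempt with an empty list had no error email inside the window, which may itself be worth noting given that some failures seem to send no email at all.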
This doc summarises the dates on which particular volumes / tasks were added (or attempted): ITS review.xlsx
Some emails received from attempts to add tasks / projects copied below (with times). No email received for the majority of tasks added that failed - this might be because I was deleting the volumes almost immediately to prevent any users experiencing the confetti task.
We would need to add tasks that show the confetti bug and leave them there to receive a notice, I expect?
LibCrowds Support support@libcrowds.com To:
Mon, Jun 15 at 8:25 AM
Hello,
126 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre, Scarborough 1784-1846.!
All the best, The LibCrowds team.
Mon, Jun 15 at 2:26 PM
Hello,
42 new tasks were imported successfully to your project Transcribe Titles: Windsor Castle 1849-1861!
All the best, The LibCrowds team.
Mon, Jun 15 at 2:45 PM
Hello,
554 new tasks were imported successfully to your project Transcribe Dates: Miscellaneous Birmingham theatres 1774-1800!
All the best, The LibCrowds team.
Mon, Jun 15 at 3:03 PM
Hello,
It looks like there were no new records to import to your project Transcribe Genres: Theatre Royal, Bristol 1819-1823 (Vol. 2)!
All the best, The LibCrowds team.
Mon, Jun 15 at 4:52 PM
Hello,
368 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre Royal, Liverpool 1820-1822 (Vol. 1)!
All the best, The LibCrowds team.
Mon, Jun 15 at 5:03 PM
Hello,
281 new tasks were imported successfully to your project Mark Titles: Covent Garden Theatre 1753-1779!
All the best, The LibCrowds team.
Mon, Jun 15 at 5:32 PM
Hello,
300 new tasks were imported successfully to your project Transcribe Dates: A collection of playbills from Theatre, Drayton 1795-1844 (Vol. 1)!
All the best, The LibCrowds team.
Tue, Jun 16 at 2:33 PM
Hello,
316 new tasks were imported successfully to your project Transcribe Dates: A collection of playbills from miscellaneous theatres: Huddersfield - Ledbury 1783-1864 (Vol. 2)!
All the best, The LibCrowds team.
I've looked up the traceback errors for the first three attempts. They look pretty useful so I can do the rest if it'd help @harryjmoss
Mon, Jun 15 at 8:25 AM 126 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre, Scarborough 1784-1846.!
af17cb36-be7d-4e16-8446-18724062782d has failed more than 3 times [arrived 08:34] Please, review the background jobs of your server. This is the trace error
Traceback (most recent call last):
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/worker.py", line 479, in perform_job
    rv = job.perform()
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/job.py", line 466, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/jobs.py", line 65, in import_tasks_with_redundancy
    import_tasks(project_id, import_data)
  File "/var/www/pybossa/pybossa/jobs.py", line 519, in import_tasks
    report = importer.create_tasks(task_repo, project_id, **form_data)
  File "/var/www/pybossa/pybossa/importers/importer.py", line 68, in create_tasks
    for task_data in importer.tasks():
  File "/var/www/pybossa/pybossa/importers/iiif.py", line 38, in tasks
    return self._generate_tasks()
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 28, in _generate_tasks
    child_task_data = self._get_child_task_data(task_data, self.parent_id)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 48, in _get_child_task_data
    raise BulkImportException(err_msg)
BulkImportException: A parent annotation has an invalid target
Mon, Jun 15 at 2:26 PM 42 new tasks were imported successfully to your project Transcribe Titles: Windsor Castle 1849-1861!
33653ad0-8729-40f7-8946-68e7bcda79e5 has failed more than 3 times [arrived 14:33]
Please, review the background jobs of your server. This is the trace error
Traceback (most recent call last):
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/worker.py", line 479, in perform_job
    rv = job.perform()
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/job.py", line 466, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/jobs.py", line 65, in import_tasks_with_redundancy
    import_tasks(project_id, import_data)
  File "/var/www/pybossa/pybossa/jobs.py", line 519, in import_tasks
    report = importer.create_tasks(task_repo, project_id, **form_data)
  File "/var/www/pybossa/pybossa/importers/importer.py", line 68, in create_tasks
    for task_data in importer.tasks():
  File "/var/www/pybossa/pybossa/importers/iiif.py", line 38, in tasks
    return self._generate_tasks()
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 28, in _generate_tasks
    child_task_data = self._get_child_task_data(task_data, self.parent_id)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 48, in _get_child_task_data
    raise BulkImportException(err_msg)
BulkImportException: A parent annotation has an invalid target
Mon, Jun 15 at 2:45 PM 554 new tasks were imported successfully to your project Transcribe Dates: Miscellaneous Birmingham theatres 1774-1800!
de42bf2f-e47e-4f4a-955f-83d746149490 has failed more than 3 times [arrived 14:34]
Traceback (most recent call last):
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/worker.py", line 479, in perform_job
    rv = job.perform()
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/job.py", line 466, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/jobs.py", line 65, in import_tasks_with_redundancy
    import_tasks(project_id, import_data)
  File "/var/www/pybossa/pybossa/jobs.py", line 519, in import_tasks
    report = importer.create_tasks(task_repo, project_id, **form_data)
  File "/var/www/pybossa/pybossa/importers/importer.py", line 73, in create_tasks
    task_repo.save(task)
  File "/var/www/pybossa/pybossa/repositories/task_repository.py", line 107, in save
    raise DBIntegrityError(e)
DBIntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "task" violates foreign key constraint "task_project_id_fkey"
DETAIL: Key (project_id)=(242) is not present in table "project".
[SQL: INSERT INTO task (created, project_id, state, quorum, calibration, priority_0, info, n_answers, fav_user_ids) VALUES (%(created)s, %(project_id)s, %(state)s, %(quorum)s, %(calibration)s, %(priority_0)s, %(info)s, %(n_answers)s, %(fav_user_ids)s) RETURNING task.id] [parameters: {'info': '{"tileSource": "https://api.bl.uk/image/iiif/ark:/81055/vdc_100022589089.0x000107/info.json", "url_m": "https://api.bl.uk/image/iiif/ark:/81055/vdc_1 ... (533 characters truncated) ... dc_100022589089.0x000107/full/1024,/0/default.jpg", "manifest": "https://api.bl.uk/metadata/iiif/ark:/81055/vdc_100022589090.0x000002/manifest.json"}', 'fav_user_ids': None, 'n_answers': 30, 'quorum': 0, 'calibration': 0, 'created': '2020-06-15T13:33:31.681528', 'state': u'ongoing', 'project_id': 242, 'priority_0': 0}] (Background on this error at: http://sqlalche.me/e/gkpj)
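One possible reading of the foreign key error above, offered as an assumption rather than a diagnosis: the background job tried to insert tasks for project 242 after that project row had already gone (for example, if the volume was deleted before the queued import ran). The sketch below reproduces that kind of failure in Python 3 with an in-memory SQLite database standing in for the site's PostgreSQL database. The two-column tables are simplified stand-ins, not the real PyBossa schema, and `try_insert_task` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE task ("
    "id INTEGER PRIMARY KEY, "
    "project_id INTEGER NOT NULL REFERENCES project(id))"
)

# Create the project, then delete it before the import job gets to run.
conn.execute("INSERT INTO project (id, name) VALUES (242, 'example project')")
conn.execute("DELETE FROM project WHERE id = 242")


def try_insert_task(connection, project_id):
    """Attempt the insert a delayed import job would make; report the outcome."""
    try:
        connection.execute(
            "INSERT INTO task (project_id) VALUES (?)", (project_id,)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return "failed: %s" % exc


result = try_insert_task(conn, 242)
print(result)
```

If this is what happened, the insert fails for the same underlying reason as in the traceback: the task points at a project id that no longer exists. It would not explain the `BulkImportException` failures, which look like a separate problem.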
More notes from Christian from email about his spreadsheet:
The attached spreadsheet shows volumes on ITS with status for each project (complete; failed [plus date]; added [plus date]).
I have colour-coded vols in green to indicate a successful action / new project and red to indicate an unsuccessful action.
I have tried using the “Analyse empty results” function on Pybossa, which seems to have had no effect, BUT part-completed volumes have reappeared (maybe because “Analyse empty results” was triggered?). Two projects reappearing are:
I tried to update (reload) a manifest for a volume from which no projects had been done: Birmingham theatres 1801-1805 (Vol. 1) but it still failed.
I have added two new volumes, but the projects still bombed:
Curiously, I have been offered a task for a volume in which that task has already been completed:
Weirdly, I got one project to load successfully after I tinkered with the Task Scheduler: I changed it from ‘Default’ to ‘Depth First All’. This might have re-jigged a project that at first threw up confetti but then worked after I altered the Task Scheduler. Unfortunately, I could not replicate this success with another bombed task. I got momentarily excited that a fix or workaround had been chanced upon. The volume that loaded this way was, I think, either:
Have just added a fresh volume: A collection of playbills from Theatre Royal, Hull 1827-1830. Used manifest below https://api.bl.uk/metadata/iiif/ark:/81055/vdc_100022589160.0x000002/manifest.json?manifest Prepared a Mark Titles project - confetti bug occurs.
Email says it was successful: Tue, Jun 23 at 5:15 PM
Hello,
368 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre Royal, Hull 1827-1830!
All the best, The LibCrowds team.
Checked project task and works.
Added TRANSCRIBE DATES: A COLLECTION OF PLAYBILLS FROM THEATRE ROYAL, MANCHESTER 1793-1808 (VOL. 1)
No email to say successful - confetti bug occurs.
Added: Mark Titles: A collection of playbills from Theatre Royal, Manchester 1793-1808 (Vol. 2)
Email says it was successful.
@christianalgar the Manchester Transcribe Dates task is also confetti-ing. I noticed that the project listing page says it has 0 tasks - I guess that's useful in terms of checking them quickly, and might also be diagnostic for @harryjmoss ?
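If the "0 tasks" symptom does turn out to be diagnostic, a quick check could be scripted rather than done by eye. Below is a minimal Python 3 sketch. `find_empty_projects` is a hypothetical helper, and `has_tasks` assumes the site exposes PyBossa's REST endpoint `/api/task?project_id=...` without needing an API key, which has not been verified for this deployment. The sample counts are made up.

```python
import json
from urllib.request import urlopen


def has_tasks(base_url, project_id):
    """Return True if the API lists at least one task for the project.

    Assumes the task endpoint is reachable without authentication.
    """
    url = "%s/api/task?project_id=%d&limit=1" % (base_url.rstrip("/"), project_id)
    with urlopen(url) as response:
        return len(json.load(response)) > 0


def find_empty_projects(project_ids, task_checker):
    """Return the ids of projects for which task_checker reports no tasks."""
    return [pid for pid in project_ids if not task_checker(pid)]


# Offline demonstration with made-up task counts instead of live API calls.
fake_counts = {241: 368, 242: 0, 243: 300}
empty = find_empty_projects(fake_counts, lambda pid: fake_counts[pid] > 0)
print(empty)
```

Against the live site, the checker would be something like `lambda pid: has_tasks(site_url, pid)`, where `site_url` is whatever base URL serves the PyBossa API.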
The errors are also available in the 'Background tasks' screen on the site's backend menu.
@harryjmoss a sudden thought - could the SQL errors be related to recent(ish) database changes made during the other work?
Cribbed in part from https://github.com/LibCrowds/libcrowds/issues/843 where we were originally dealing with the issue:
Summary: volumes added to the site via the project admin interface appear to save successfully but volunteers can't access tasks on them.
When adding new volumes via the admin interface, all seems to go well, and you'll get an email saying '[number] new tasks were imported successfully to your project [task type e.g. Transcribe Dates]: [volume name e.g. Miscellaneous theatres: Stroud - Tullamore 1788-1848 (Vol. 2)]!'.
Christian additionally noted that trends seem to be:
Steps to reproduce: [we might need to update this - @christianalgar does it match with your more recent experience?]
Expected results: a new volume should be available on the projects page https://www.libcrowds.com/collection/playbills/projects. When a volunteer follows the link, they should have access to the task.
Actual result: When a volunteer follows the link, they get the 'confetti' message, 'Hooray! You have completed all available tasks for this project. As we have more than one person to complete each task to ensure high quality results, we still need more contributions before the project is marked as complete, so please spread the word!'
Related:
[ ] original early diagnosis of the issue https://github.com/LibCrowds/libcrowds/issues/843#issuecomment-524083526 'What I can see from my analysis of the database is that some projects didn't have any tasks and it seems to correlate with a missing manifest in the parent volumes. I'm not sure how this actually happened, whether it is an omission during creation or the admin interface being slightly buggy and loosing some inputs when things are not done in a proper sequence.'
[ ] Update 'how to add a new volume' documentation https://github.com/LibCrowds/libcrowds/issues/850