LibCrowds / libcrowds

The frontend for the LibCrowds crowdsourcing platform
MIT License

Adding new tasks for a volume seems to succeed but the public interface throws confetti #862

Open mialondon opened 4 years ago

mialondon commented 4 years ago

Cribbed in part from https://github.com/LibCrowds/libcrowds/issues/843 where we were originally dealing with the issue:

Summary: volumes added to the site via the project admin interface appear to save successfully but volunteers can't access tasks on them.

When adding new volumes via the admin interface, all seems to go well, and you'll get an email saying '[number] new tasks were imported successfully to your project [task type e.g. Transcribe Dates]: [volume name e.g. Miscellaneous theatres: Stroud - Tullamore 1788-1848 (Vol. 2)]!'.

Christian additionally noted that trends seem to be:

Steps to reproduce: [we might need to update this - @christianalgar does it match with your more recent experience?]

  1. find a sample manifest to add (e.g. https://api.bl.uk/metadata/iiif/ark:/81055/vdc_100022588967.0x000002/manifest.json ) by searching for 'A collection of playbills' in the Catalogue box in the top right-hand corner of https://bl.uk
  2. add a new volume via https://www.libcrowds.com/admin/collection/playbills/volumes/new following the steps in https://github.com/LibCrowds/libcrowds/issues/850
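
As a sanity check before step 2, the number of canvases in the manifest can be compared with the task count reported in the confirmation email. The sketch below assumes the manifest follows the IIIF Presentation API 2.x layout (`sequences` containing `canvases`) and that the import creates roughly one task per canvas; neither assumption is confirmed in this thread, and the sample manifest is a made-up stand-in for a downloaded `manifest.json`.

```python
import json


def count_canvases(manifest):
    """Count canvases across all sequences of a IIIF Presentation 2.x manifest dict."""
    return sum(len(seq.get("canvases", [])) for seq in manifest.get("sequences", []))


# Hypothetical, heavily trimmed manifest used only for illustration.
sample = json.loads(
    '{"sequences": [{"canvases": [{"@id": "c1"}, {"@id": "c2"}, {"@id": "c3"}]}]}'
)
canvas_count = count_canvases(sample)
print(canvas_count)
```

If the count differs wildly from the number in the email, that is a hint that the import did not process the manifest as expected.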

Expected results: a new volume should be available on the projects page https://www.libcrowds.com/collection/playbills/projects. When a volunteer follows the link, they should have access to the task.

Actual result: When a volunteer follows the link, they get the 'confetti' message, 'Hooray! You have completed all available tasks for this project. As we have more than one person to complete each task to ensure high quality results, we still need more contributions before the project is marked as complete, so please spread the word!'

Related:

mialondon commented 4 years ago

Some thoughts on uncovering the error: @christianalgar can you note the time and dates when you've tried to add volumes recently that led to the error? We might be able to match them to error messages with details in the traceback calls sent via email.

For example, the inbox has errors from Tuesday around 1:30, 2:30 and 5:30pm.
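
A rough way to do that matching, once the times are written down, is to pair each attempt with the first error email that arrived shortly afterwards. This is only a sketch: the 15-minute window is an arbitrary assumption, and the timestamps below are made-up illustrations rather than real data from the inbox.

```python
from datetime import datetime, timedelta


def match_errors(attempts, errors, window=timedelta(minutes=15)):
    """Pair each attempt time with the first error that arrived within `window` after it."""
    ordered_errors = sorted(errors)
    matches = {}
    for attempt in attempts:
        candidates = [e for e in ordered_errors if attempt <= e <= attempt + window]
        matches[attempt] = candidates[0] if candidates else None
    return matches


# Illustrative times only (not taken from the real inbox).
attempts = [datetime(2020, 6, 16, 14, 26), datetime(2020, 6, 16, 16, 0)]
errors = [
    datetime(2020, 6, 16, 13, 30),
    datetime(2020, 6, 16, 14, 30),
    datetime(2020, 6, 16, 17, 30),
]
matches = match_errors(attempts, errors)
for attempt, error in matches.items():
    print(attempt, "->", error)
```

Attempts that map to `None` had no error email inside the window, which is worth noting separately because some failures apparently produced no email at all.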

christianalgar commented 4 years ago

This doc summarises the dates when I attempted to add particular volumes / tasks: ITS review.xlsx

Some emails received from attempts to add tasks / projects copied below (with times). No email received for the majority of tasks added that failed - this might be because I was deleting the volumes almost immediately to prevent any users experiencing the confetti task.

We would need to add tasks that show the confetti bug and leave them there to receive a notice, I expect?

LibCrowds Support support@libcrowds.com To:

Mon, Jun 15 at 8:25 AM

Hello,

126 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre, Scarborough 1784-1846.!

All the best, The LibCrowds team.

Mon, Jun 15 at 2:26 PM

Hello,

42 new tasks were imported successfully to your project Transcribe Titles: Windsor Castle 1849-1861!

All the best, The LibCrowds team.

Mon, Jun 15 at 2:45 PM

Hello,

554 new tasks were imported successfully to your project Transcribe Dates: Miscellaneous Birmingham theatres 1774-1800!

All the best, The LibCrowds team.

Mon, Jun 15 at 3:03 PM

Hello,

It looks like there were no new records to import to your project Transcribe Genres: Theatre Royal, Bristol 1819-1823 (Vol. 2)!

All the best, The LibCrowds team.

Mon, Jun 15 at 4:52 PM

Hello,

368 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre Royal, Liverpool 1820-1822 (Vol. 1)!

All the best, The LibCrowds team.

Mon, Jun 15 at 5:03 PM

Hello,

281 new tasks were imported successfully to your project Mark Titles: Covent Garden Theatre 1753-1779!

All the best, The LibCrowds team.

Mon, Jun 15 at 5:32 PM

Hello,

300 new tasks were imported successfully to your project Transcribe Dates: A collection of playbills from Theatre, Drayton 1795-1844 (Vol. 1)!

All the best, The LibCrowds team.

Tue, Jun 16 at 2:33 PM

Hello,

316 new tasks were imported successfully to your project Transcribe Dates: A collection of playbills from miscellaneous theatres: Huddersfield - Ledbury 1783-1864 (Vol. 2)!

All the best, The LibCrowds team.

mialondon commented 4 years ago

I've looked up the traceback errors for the first three attempts. They look pretty useful so I can do the rest if it'd help @harryjmoss

Mon, Jun 15 at 8:25 AM 126 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre, Scarborough 1784-1846.!

af17cb36-be7d-4e16-8446-18724062782d has failed more than 3 times [arrived 08:34] Please, review the background jobs of your server. This is the trace error


Traceback (most recent call last):
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/worker.py", line 479, in perform_job
    rv = job.perform()
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/job.py", line 466, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/jobs.py", line 65, in import_tasks_with_redundancy
    import_tasks(project_id, import_data)
  File "/var/www/pybossa/pybossa/jobs.py", line 519, in import_tasks
    report = importer.create_tasks(task_repo, project_id, **form_data)
  File "/var/www/pybossa/pybossa/importers/importer.py", line 68, in create_tasks
    for task_data in importer.tasks():
  File "/var/www/pybossa/pybossa/importers/iiif.py", line 38, in tasks
    return self._generate_tasks()
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 28, in _generate_tasks
    child_task_data = self._get_child_task_data(task_data, self.parent_id)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 48, in _get_child_task_data
    raise BulkImportException(err_msg)
BulkImportException: A parent annotation has an invalid target

Mon, Jun 15 at 2:26 PM 42 new tasks were imported successfully to your project Transcribe Titles: Windsor Castle 1849-1861!

33653ad0-8729-40f7-8946-68e7bcda79e5 has failed more than 3 times [arrived 14:33]

Please, review the background jobs of your server. This is the trace error


Traceback (most recent call last):
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/worker.py", line 479, in perform_job
    rv = job.perform()
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/job.py", line 466, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/jobs.py", line 65, in import_tasks_with_redundancy
    import_tasks(project_id, import_data)
  File "/var/www/pybossa/pybossa/jobs.py", line 519, in import_tasks
    report = importer.create_tasks(task_repo, project_id, **form_data)
  File "/var/www/pybossa/pybossa/importers/importer.py", line 68, in create_tasks
    for task_data in importer.tasks():
  File "/var/www/pybossa/pybossa/importers/iiif.py", line 38, in tasks
    return self._generate_tasks()
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 28, in _generate_tasks
    child_task_data = self._get_child_task_data(task_data, self.parent_id)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", line 48, in _get_child_task_data
    raise BulkImportException(err_msg)
BulkImportException: A parent annotation has an invalid target
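
Since the first two tracebacks look identical, it may help to reduce each error email to its deepest frame and exception so that repeats can be grouped quickly. The sketch below is a hypothetical triage helper, not part of PYBOSSA or the pybossa_lc plugin; the regular expressions are assumptions that happen to fit the traces pasted here, and the sample string is a shortened excerpt of the trace above.

```python
import re

# Matches one frame header such as: File "/path/x.py", line 48, in func_name
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\w+)')
# Matches the final "SomeError: message" or "SomeException: message" part.
EXC_RE = re.compile(r"(\w*(?:Error|Exception)): (.+)$")


def summarise(trace):
    """Return (deepest_frame, exception_type, message) for a pasted traceback string."""
    frames = [(m["path"], int(m["line"]), m["func"]) for m in FRAME_RE.finditer(trace)]
    last_frame = frames[-1] if frames else None
    exc = EXC_RE.search(trace.strip())
    if exc is None:
        return last_frame, None, None
    return last_frame, exc.group(1), exc.group(2)


# Shortened excerpt of the trace pasted in this thread.
trace = (
    "Traceback (most recent call last): "
    'File "/var/www/pybossa/pybossa/importers/iiif.py", line 38, in tasks '
    "return self._generate_tasks() "
    'File "/var/www/pybossa/pybossa/plugins/pybossa_lc/importers/iiif_enhanced.py", '
    "line 48, in _get_child_task_data raise BulkImportException(err_msg) "
    "BulkImportException: A parent annotation has an invalid target"
)
summary = summarise(trace)
print(summary)
```

Two emails that produce the same summary are very likely the same underlying failure.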

Mon, Jun 15 at 2:45 PM 554 new tasks were imported successfully to your project Transcribe Dates: Miscellaneous Birmingham theatres 1774-1800!

de42bf2f-e47e-4f4a-955f-83d746149490 has failed more than 3 times [arrived 14:34]

Traceback (most recent call last):
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/worker.py", line 479, in perform_job
    rv = job.perform()
  File "/var/www/pybossa/env/local/lib/python2.7/site-packages/rq/job.py", line 466, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/var/www/pybossa/pybossa/plugins/pybossa_lc/jobs.py", line 65, in import_tasks_with_redundancy
    import_tasks(project_id, import_data)
  File "/var/www/pybossa/pybossa/jobs.py", line 519, in import_tasks
    report = importer.create_tasks(task_repo, project_id, **form_data)
  File "/var/www/pybossa/pybossa/importers/importer.py", line 73, in create_tasks
    task_repo.save(task)
  File "/var/www/pybossa/pybossa/repositories/task_repository.py", line 107, in save
    raise DBIntegrityError(e)
DBIntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "task" violates foreign key constraint "task_project_id_fkey"
DETAIL: Key (project_id)=(242) is not present in table "project".

[SQL: INSERT INTO task (created, project_id, state, quorum, calibration, priority_0, info, n_answers, fav_user_ids) VALUES (%(created)s, %(project_id)s, %(state)s, %(quorum)s, %(calibration)s, %(priority_0)s, %(info)s, %(n_answers)s, %(fav_user_ids)s) RETURNING task.id] [parameters: {'info': '{"tileSource": "https://api.bl.uk/image/iiif/ark:/81055/vdc_100022589089.0x000107/info.json", "url_m": "https://api.bl.uk/image/iiif/ark:/81055/vdc_1 ... (533 characters truncated) ... dc_100022589089.0x000107/full/1024,/0/default.jpg", "manifest": "https://api.bl.uk/metadata/iiif/ark:/81055/vdc_100022589090.0x000002/manifest.json"}', 'fav_user_ids': None, 'n_answers': 30, 'quorum': 0, 'calibration': 0, 'created': '2020-06-15T13:33:31.681528', 'state': u'ongoing', 'project_id': 242, 'priority_0': 0}] (Background on this error at: http://sqlalche.me/e/gkpj)
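
The third error says a task was inserted for a project row (id 242) that no longer exists. One possible, unconfirmed explanation is that the project was deleted before the background import job ran, which would fit the note above about volumes being deleted almost immediately. The sketch below reproduces that kind of failure with an in-memory SQLite database standing in for PostgreSQL; the two-column schema is a simplified assumption, not the real PYBOSSA schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE task ("
    "id INTEGER PRIMARY KEY, "
    "project_id INTEGER NOT NULL REFERENCES project(id))"
)

# The project exists when the import is queued...
conn.execute("INSERT INTO project (id, name) VALUES (242, 'example project')")
# ...but is deleted before the (simulated) background job inserts its tasks.
conn.execute("DELETE FROM project WHERE id = 242")


def insert_task(project_id):
    """Try to insert a task and report whether the foreign key check rejected it."""
    try:
        conn.execute("INSERT INTO task (project_id) VALUES (?)", (project_id,))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return "integrity error: %s" % exc


result = insert_task(242)
print(result)
```

If this is what happened, leaving a failing project in place until the job finishes (rather than deleting it straight away) should change which error appears.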

mialondon commented 4 years ago

More notes from Christian from email about his spreadsheet:

The attached spreadsheet shows volumes on ITS with status for each project (complete; failed [plus date]; added [plus date]).

I have colour-coded vols in green to indicate a successful action / new project and red to indicate an unsuccessful action.

I have tried using the “Analyse empty results” function on Pybossa, which seems to have had no effect. However, part-completed volumes have reappeared (maybe because “Analyse empty results” was triggered?). The two projects that reappeared are:

I tried to update (reload) a manifest for a volume from which no projects had been done: Birmingham theatres 1801-1805 (Vol. 1) but it still failed.

I have added two new volumes, but the projects still bombed:

Curiously, I have been offered a task for a volume where that task has already been completed:

Weirdly, I got one project to load successfully after I tinkered with the Task Scheduler: I changed it from ‘Default’ to ‘Depth First All’. This might have re-jigged a project that at first threw up confetti but then worked after the Task Scheduler was altered. Unfortunately, I could not replicate this success with another bombed task. I got momentarily excited that a fix or workaround had been chanced upon. The volume that loaded this way was, I think, either:

christianalgar commented 4 years ago

  1. Have just added a fresh volume: A collection of playbills from Theatre Royal, Hull 1827-1830. Used the manifest https://api.bl.uk/metadata/iiif/ark:/81055/vdc_100022589160.0x000002/manifest.json?manifest and prepared a Mark Titles project - confetti bug occurs.

  2. Email says it was successful: Tue, Jun 23 at 5:15 PM

Hello,

368 new tasks were imported successfully to your project Mark Titles: A collection of playbills from Theatre Royal, Hull 1827-1830!

All the best, The LibCrowds team.

  3. Checked the project task and it works.

  4. Added TRANSCRIBE DATES: A COLLECTION OF PLAYBILLS FROM THEATRE ROYAL, MANCHESTER 1793-1808 (VOL. 1)

  5. No email to say it was successful - confetti bug occurs.

  6. Added: Mark Titles: A collection of playbills from Theatre Royal, Manchester 1793-1808 (Vol. 2)

  7. Email says it was successful.

mialondon commented 4 years ago

@christianalgar the Manchester Transcribe Dates task is also confetti-ing. I noticed that the project listing page says it has 0 tasks - I guess that's useful in terms of checking them quickly, and might also be diagnostic for @harryjmoss ?
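
If the project data can be exported or fetched in bulk, the "0 tasks" check could be automated along these lines. This is a minimal sketch: the `name` and `n_tasks` field names are hypothetical, and the records are illustrative values based on the projects mentioned in this thread rather than real API output.

```python
def projects_without_tasks(projects):
    """Return the names of projects whose task count is zero or missing."""
    return [p["name"] for p in projects if p.get("n_tasks", 0) == 0]


# Illustrative records; field names are assumptions, not a confirmed API shape.
projects = [
    {
        "name": "Mark Titles: A collection of playbills from Theatre Royal, Hull 1827-1830",
        "n_tasks": 368,
    },
    {
        "name": "Transcribe Dates: A collection of playbills from Theatre Royal, "
        "Manchester 1793-1808 (Vol. 1)",
        "n_tasks": 0,
    },
]
empty = projects_without_tasks(projects)
print(empty)
```

Any project in the returned list would be a candidate for the confetti message.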

Screenshot_2020-06-23 In the Spotlight

mialondon commented 4 years ago

The errors are also available in the 'Background tasks' screen on the site's backend menu.

@harryjmoss a sudden thought - could the SQL errors be related to recent(ish) database changes made during the other work?