Open timothydereuse opened 1 year ago
I am looking into creating checks for specific jobs and would like to know which errors you commonly run into when working on Rodan. As people who frequently work on E2E Rodan workflows, can you list some common issues that you would like to see specific messages for (e.g. wrong file input, empty pages, etc.)? @JoyfulGen @martha-thomae
I am trying my best to remember some of these issues. Right now I can only think of one:
I think that when I get to test the end-to-end OMR workflow again, I will think of more things besides this and what Tim already said about the staff processing.
I've been noodling around with simple mistakes I can imagine new or distracted users might make, and here's what I've got so far:
Mistake: Assigning the same model to two different input ports in the Fast Pixelwise Analysis job. Result: MEI_encoding job fails, with the following traceback error:
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 412, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/code/Rodan/rodan/jobs/base.py", line 773, in run
retval = self.run_my_task(inputs, settings, arg_outputs)
File "/code/Rodan/rodan/jobs/MEI_encoding/MEI_encoding.py", line 85, in run_my_task
mei_string = bm.process(jsomr, syls, classifier_table, width_mult, width_container)
File "/code/Rodan/rodan/jobs/MEI_encoding/build_mei_file.py", line 718, in process
meiDoc = build_mei(pairs, classifier, width_container, jsomr['staves'], jsomr['page'])
File "/code/Rodan/rodan/jobs/MEI_encoding/build_mei_file.py", line 503, in build_mei
bb = staves[0]['bounding_box']
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 429, in trace_task
I, R, state, retval = on_error(task_request, exc, uuid)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 366, in on_error
task, request, eager=eager, call_errbacks=call_errbacks,
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 173, in handle_error_state
call_errbacks=call_errbacks)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 221, in handle_failure
task.on_failure(exc, req.id, req.args, req.kwargs, einfo)
File "/code/Rodan/rodan/jobs/base.py", line 1015, in on_failure
and user.user_preference.sned_email
AttributeError: 'UserPreference' object has no attribute 'sned_email'
Mistake: Assigning a previously generated symbol layer as the original folio image. Result: MEI_encoding job fails, with the following traceback error:
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/code/Rodan/rodan/jobs/base.py", line 773, in run
retval = self.run_my_task(inputs, settings, arg_outputs)
File "/code/Rodan/rodan/jobs/MEI_encoding/MEI_encoding.py", line 85, in run_my_task
mei_string = bm.process(jsomr, syls, classifier_table, width_mult, width_container)
File "/code/Rodan/rodan/jobs/MEI_encoding/build_mei_file.py", line 718, in process
meiDoc = build_mei(pairs, classifier, width_container, jsomr['staves'], jsomr['page'])
File "/code/Rodan/rodan/jobs/MEI_encoding/build_mei_file.py", line 503, in build_mei
bb = staves[0]['bounding_box']
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 429, in trace_task
I, R, state, retval = on_error(task_request, exc, uuid)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 366, in on_error
task, request, eager=eager, call_errbacks=call_errbacks,
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 173, in handle_error_state
call_errbacks=call_errbacks)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 221, in handle_failure
task.on_failure(exc, req.id, req.args, req.kwargs, einfo)
File "/code/Rodan/rodan/jobs/base.py", line 1015, in on_failure
and user.user_preference.sned_email
AttributeError: 'UserPreference' object has no attribute 'sned_email'
Mistake: In the NIC job, assigning the split_features file to the training data port and vice versa. Result: NIC job fails, with the following traceback error:
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/code/Rodan/rodan/jobs/base.py", line 773, in run
retval = self.run_my_task(inputs, settings, arg_outputs)
File "/code/Rodan/rodan/jobs/gamera_rodan/wrappers/classification.py", line 91, in run_my_task
cknn = gamera.knn.kNNNonInteractive(tempPath)
File "/usr/local/lib/python3.7/site-packages/gamera/knn.py", line 686, in __init__
classify.NonInteractiveClassifier.__init__(self, database, perform_splits)
File "/usr/local/lib/python3.7/site-packages/gamera/classify.py", line 515, in __init__
self.from_xml_filename(database)
File "/usr/local/lib/python3.7/site-packages/gamera/classify.py", line 427, in from_xml_filename
self._from_xml(stream)
File "/usr/local/lib/python3.7/site-packages/gamera/classify.py", line 432, in _from_xml
self.set_glyphs(database)
File "/usr/local/lib/python3.7/site-packages/gamera/classify.py", line 558, in set_glyphs
self.instantiate_from_images(self.database, self.normalize)
ValueError: Initial database of a non-interactive kNN classifier must have at least one element.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 429, in trace_task
I, R, state, retval = on_error(task_request, exc, uuid)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 366, in on_error
task, request, eager=eager, call_errbacks=call_errbacks,
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 173, in handle_error_state
call_errbacks=call_errbacks)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 221, in handle_failure
task.on_failure(exc, req.id, req.args, req.kwargs, einfo)
File "/code/Rodan/rodan/jobs/base.py", line 1015, in on_failure
and user.user_preference.sned_email
AttributeError: 'UserPreference' object has no attribute 'sned_email'
There are also mistakes the user can make that don't provoke a failure of the workflow, but mess up the final results. Let me know if those would be useful to know also!
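Both MEI_encoding failures above end at `staves[0]` on an empty list, so one possible check is to validate the JSOMR before building the MEI. This is only a minimal sketch: `JobInputError` and `check_staves` are hypothetical names, not existing Rodan code, and the only assumption about the JSOMR is that it is a dict with a `staves` list, as the traceback suggests.

```python
class JobInputError(Exception):
    """Hypothetical exception for failures the user can fix by changing inputs."""


def check_staves(jsomr):
    """Return the staves list, or fail early with a readable message."""
    staves = jsomr.get("staves") or []
    if not staves:
        raise JobInputError(
            "No staves were found in the JSOMR input. Check that the original "
            "folio image and the layer models are assigned to the correct ports."
        )
    return staves


# Example: an empty staves list produces a readable message instead of IndexError.
try:
    check_staves({"staves": [], "page": {}})
except JobInputError as err:
    message = str(err)
```

Calling something like this before the `staves[0]['bounding_box']` lookup would replace the bare `IndexError: list index out of range` with a message that points at the likely cause.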
This is mildly unrelated, but I'm assuming the sned_email (in rodan-main/code/rodan/jobs/base.py) is a typo?
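If it is a typo, it also explains why every traceback above ends in the same `AttributeError`: the failure handler itself crashes after the job's own exception. A defensive lookup could look roughly like the sketch below. Note that `send_email` is an assumed field name (the presumably intended spelling), and `should_email_on_failure` is a hypothetical helper, not a function in base.py.

```python
from types import SimpleNamespace


def should_email_on_failure(user):
    """Return True only if the user has a preference object with email enabled."""
    preference = getattr(user, "user_preference", None)
    # `send_email` is an assumption; the traceback only shows the misspelled `sned_email`.
    return bool(getattr(preference, "send_email", False))


# Stand-in objects for illustration; the real ones would be Django model instances.
opted_in = SimpleNamespace(user_preference=SimpleNamespace(send_email=True))
no_preference = SimpleNamespace()
```

Using `getattr` with a default means a missing or misnamed attribute results in no email rather than a second exception on top of the job's real error.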
@JoyfulGen if this is too much work do not worry about it at all, but for the first two errors, do you happen to have the MEI encoding inputs that cause the error outputs? Otherwise I can re-create it!
@sabrina0822 here they are:
Assigned the music_symbol model to both layer 1 and layer 2 input ports of the Fast_pixelwise job: 129r_same_model_twice_PF.json.zip, 129r_same_model_twice_TA.json.zip
Used a previously generated symbol layer as the original image: using_symbol_layer_as_image_PF.json.zip, using_symbol_layer_as_image_TA.json.zip
Currently, when a job fails, the only feedback the user gets is a Python error traceback. In addition to this traceback, the job should be able to pass a message back to the client that the user can see.
For example, if the staff-finding job can fail because it is given a blank image, the job itself should check for that, and when it throws an exception, the exception's message should be delivered to the user in a friendlier, easier-to-read form than the raw Python stack trace. Hopefully this would let people figure out on their own why a job went wrong, even if they do not know how to interpret the traceback.
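A rough sketch of what this could look like, under stated assumptions: `UserFacingJobError`, `find_staves`, and `describe_failure` are hypothetical names, the image is represented as a plain 2D list of grayscale values, and how the resulting payload would actually reach the Rodan client is left open.

```python
import traceback


class UserFacingJobError(Exception):
    """Hypothetical exception whose message is safe to show to the user."""


def find_staves(pixels):
    """Stand-in for a staff-finding job; `pixels` is a 2D list of grayscale values."""
    values = {value for row in pixels for value in row}
    if len(values) <= 1:
        # Every pixel has the same value (or there are none): treat as blank.
        raise UserFacingJobError(
            "The input image appears to be blank, so no staves can be found."
        )
    return "staves"  # placeholder for the real result


def describe_failure(exc):
    """Build a payload with a friendly message alongside the full traceback."""
    if isinstance(exc, UserFacingJobError):
        message = str(exc)
    else:
        message = "The job failed unexpectedly. See the traceback for details."
    return {
        "message": message,
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


try:
    find_staves([[255, 255], [255, 255]])
except Exception as exc:
    report = describe_failure(exc)
```

The traceback is kept in the payload so that developers lose nothing, while the client can show `report["message"]` first.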