AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Processor job failed due to UnicodeDecodeError #3351

Open arkid15r opened 1 year ago

arkid15r commented 1 year ago

Context

The e2e test run failed with the following error:

Problem or idea

2023-08-11 01:21:52,207 i-0ce37044d5ae9d30a data_refinery_workers.processors.utils ERROR [processor_job: 285555] [no_retry: False]: Unhandled exception caught while running processor function _determine_index_length in pipeline:
Traceback (most recent call last):
  File "/home/user/data_refinery_workers/processors/utils.py", line 405, in run_pipeline
    last_result = processor(last_result)
  File "/home/user/data_refinery_workers/processors/salmon.py", line 284, in _determine_index_length
    for line in process.stdout:
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 8: invalid start byte
2023-08-11 01:21:52,584 i-0ce37044d5ae9d30a data_refinery_workers.processors.utils ERROR [no_retry: False] [pipeline_applied: SALMON] [failure_reason: Unhandled exception caught while running processor function _determine_index_length in pipeline: 'utf-8' codec can't decode byte 0x88 in position 8: invalid start byte] [processor_job: 285555]: Processor job failed!

Solution or next step

Find out the root cause of the issue and resolve it.
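
As a starting point for debugging, here is a minimal sketch that reproduces this class of failure outside the pipeline. The traceback shows the error is raised while iterating over `process.stdout` as text, so the sketch assumes the subprocess output is being decoded as strict UTF-8. Everything here is hypothetical and not taken from `salmon.py`: the `CHILD` command is a stand-in that just emits the byte 0x88, and `read_lines_strict` / `read_lines_tolerant` are illustrative names. Reading in binary mode and decoding with `errors="replace"` may only hide the symptom; it does not explain why the real command printed non-UTF-8 bytes.

```python
import subprocess
import sys

# Hypothetical stand-in for the real command: a child process whose first
# output line has the byte 0x88 at position 8, which is not valid UTF-8 there.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'12345678\\x88 tail\\nclean line\\n')",
]


def read_lines_strict(cmd):
    """Iterate over stdout in text mode with strict UTF-8 decoding.

    Raises UnicodeDecodeError as soon as an invalid byte is read.
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, encoding="utf-8") as process:
        return [line.rstrip("\n") for line in process.stdout]


def read_lines_tolerant(cmd):
    """Iterate over stdout as bytes and decode each line leniently.

    Invalid bytes become U+FFFD instead of aborting the loop.
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as process:
        return [
            raw.decode("utf-8", errors="replace").rstrip("\n")
            for raw in process.stdout
        ]


if __name__ == "__main__":
    try:
        read_lines_strict(CHILD)
    except UnicodeDecodeError as exc:
        print(f"strict read failed: {exc}")
    print(read_lines_tolerant(CHILD))
```

Logging the raw bytes of the offending line (for example with `repr(raw)`) before decoding could help identify what the command actually printed and whether the tolerant read is an acceptable fix.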