DailyDreaming / load-project


Row length assertions fire #145

Closed hannes-ucsc closed 4 years ago

hannes-ucsc commented 4 years ago

While running

SKUNK_ACCESSIONS="GSE110499,GSE81383,GSE128639,GSE94820,GSE93374,GSE75688,GSE127969,GSE131181" make matrices

I got

2020-01-24 02:59:52,754 MainProcess.18033 ERROR: Failed to process project projects/GSE94820
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "convert_matrices.py", line 138, in convert
    self._convert()
  File "convert_matrices.py", line 899, in _convert
    row_filter=self._fix_short_rows(1245)
  File "convert_matrices.py", line 191, in _convert_matrices
    input_.to_mtx(input_dir=self.geo_dir, output_dir=output_dir)
  File "convert_matrices.py", line 68, in to_mtx
    converter.convert(output_dir)
  File "/home/ubuntu/load-project/csv2mtx.py", line 80, in convert
    write_gzip_file(mtx_body_file, self)
  File "/home/ubuntu/load-project/csv2mtx.py", line 178, in write_gzip_file
    for line in lines:
  File "/home/ubuntu/load-project/csv2mtx.py", line 54, in __iter__
    filter_status = self.row_filter(row)
  File "convert_matrices.py", line 209, in _fix_short_rows_filter
    assert False, len(row)
AssertionError: 1140
"""

2020-01-24 02:56:54,387 MainProcess.18033 ERROR: Failed to process project projects/GSE110499
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "convert_matrices.py", line 138, in convert
    self._convert()
  File "convert_matrices.py", line 654, in _convert
    row_filter=self._fix_short_rows(172)
  File "convert_matrices.py", line 191, in _convert_matrices
    input_.to_mtx(input_dir=self.geo_dir, output_dir=output_dir)
  File "convert_matrices.py", line 68, in to_mtx
    converter.convert(output_dir)
  File "/home/ubuntu/load-project/csv2mtx.py", line 80, in convert
    write_gzip_file(mtx_body_file, self)
  File "/home/ubuntu/load-project/csv2mtx.py", line 178, in write_gzip_file
    for line in lines:
  File "/home/ubuntu/load-project/csv2mtx.py", line 54, in __iter__
    filter_status = self.row_filter(row)
  File "convert_matrices.py", line 209, in _fix_short_rows_filter
    assert False, len(row)
AssertionError: 173
"""

These look like regressions from https://github.com/DailyDreaming/load-project/commit/602d20ce574229fee989aa7f2e015b4507e9131f and https://github.com/DailyDreaming/load-project/commit/1a209acad0602528d50690e7ca22a84054746f06.

It appears that these changes were made without testing them.

If that is the case, it would save both the author's time and mine if changes were tested first. If I am wrong, I would like to find out where the ball was dropped.
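
To illustrate the kind of check that would catch this before a full `make matrices` run, here is a minimal sketch. It assumes the number passed to `_fix_short_rows` (172 and 1245 in the tracebacks above) is the expected row length; the real `_fix_short_rows_filter` in `convert_matrices.py` is not shown in this issue and may do more than this. `make_row_length_check` and `test_row_length_check` are hypothetical names, not functions from the repository.

```python
from typing import Callable, List


def make_row_length_check(expected_len: int) -> Callable[[List[str]], List[str]]:
    """Hypothetical stand-in for a row filter: pass rows of the expected
    length through unchanged and reject any other length with a message
    that names the row, the expected length and the actual length."""
    row_number = 0

    def check(row: List[str]) -> List[str]:
        nonlocal row_number
        row_number += 1
        if len(row) != expected_len:
            # A bare `assert False, len(row)` only reports the actual length;
            # reporting both values makes an off-by-one easier to spot.
            raise ValueError(
                f"row {row_number}: expected {expected_len} columns, got {len(row)}"
            )
        return row

    return check


def test_row_length_check() -> None:
    # Tiny fixture standing in for the first rows of a real matrix file.
    check = make_row_length_check(3)
    assert check(["gene", "1", "2"]) == ["gene", "1", "2"]
    try:
        check(["gene", "1", "2", "3"])
    except ValueError as e:
        assert "expected 3 columns, got 4" in str(e)
    else:
        raise AssertionError("long row was not rejected")


test_row_length_check()
```

Running a check like this against the first few rows of each affected accession would surface a mismatch such as 173 versus 172 in seconds rather than partway through a conversion.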