andersen-lab / Freyja

Depth-weighted De-Mixing
BSD 2-Clause "Simplified" License

Freyja plot failures with low-coverage samples #203

Closed kevinlibuit closed 5 months ago

kevinlibuit commented 5 months ago

Freyja plot filters out samples based on the --min_cov input value. If every sample in a set has coverage below this threshold, all of them are filtered out, leaving nothing for the Freyja workflow to plot. Freyja plot then fails with the following error when it tries to parse the empty aggregated file:

Traceback (most recent call last):
  File "/opt/conda/envs/freyja-env/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'summarized'

Raising this as an issue as it took me some time to figure out what exactly was happening.

A potential update to avoid this could be some kind of graceful failure: detect an empty aggregated demixed file, exit the execution, and display an error indicating the problem.
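To illustrate the kind of graceful failure I have in mind, here is a minimal sketch. It is not Freyja's actual code: the function name `filter_by_coverage`, the exception class, the `coverage` column name, and the example rows are all assumptions made up for illustration. The idea is only to check whether anything survives the coverage filter and to stop with a readable message before any column lookup happens.

```python
import csv
import io


class NoSamplesAboveCoverageError(ValueError):
    """Raised when coverage filtering leaves no samples to plot (hypothetical)."""


def filter_by_coverage(rows, min_cov):
    """Keep rows whose 'coverage' value is at least min_cov.

    Raises a descriptive error instead of returning an empty list, so
    downstream code never tries to read columns from an empty table.
    The 'coverage' column name is an assumption for this sketch.
    """
    kept = [row for row in rows if float(row["coverage"]) >= min_cov]
    if not kept:
        raise NoSamplesAboveCoverageError(
            f"No samples have coverage >= {min_cov}; nothing to plot. "
            "Lower the coverage threshold or check the input samples."
        )
    return kept


# Made-up aggregated file: tab-separated, one row per sample.
example_tsv = (
    "sample\tsummarized\tcoverage\n"
    "s1.tsv\t[('Omicron', 0.98)]\t41.2\n"
    "s2.tsv\t[('Delta', 0.95)]\t12.7\n"
)
rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))

try:
    # Both example samples fall below 60, so this raises the clear error.
    filter_by_coverage(rows, min_cov=60)
except NoSamplesAboveCoverageError as err:
    message = str(err)
    print(f"Error: {message}")
```

With a message like this, the user sees right away that the coverage threshold removed every sample, rather than having to trace a KeyError back through pandas.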

joshuailevy commented 5 months ago

Hey @kevinlibuit!

We actually sorted that out a couple of weeks ago (in connection with issue #166), but have not yet made a new release. The update can be seen here: https://github.com/andersen-lab/Freyja/blob/1fa14df1ad2512cb50620cff4296d6df4107b5e7/freyja/_cli.py#L378

Assuming this covers your needs, I'll go ahead and make the release. Let me know if you'd like the failure result to look different from this.

Josh

kevinlibuit commented 5 months ago

Ah, this is perfect! Sorry I missed that closed issue.