kbaseapps / kb_gtdbtk

GTDB-Tk App
MIT License

PTV-1904 fix column mismatch when merging TSV files #90

Closed briehl closed 9 months ago

briehl commented 9 months ago

One step in the output file processing merges TSV files. This can throw an error in the edge case where the files have a mismatched number of columns in one or more rows, i.e. when a row ends in empty columns that get truncated during processing.

The current fix is to use the maximum number of columns found during processing.
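For illustration only, here is a minimal sketch of the padding idea, not the code in this PR. The helper names `read_tsv` and `pad_rows` and the sample data are hypothetical; the sketch assumes the rows are already parsed into lists and that every file shares the same header.

```python
import csv
import io


def read_tsv(text):
    """Parse TSV text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


def pad_rows(rows):
    """Pad every row with empty strings up to the widest row's length."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [""] * (width - len(row)) for row in rows]


# Made-up sample data: the first file lost its trailing empty column.
first = read_tsv("id\tname\tnote\ng1\tfoo\n")
second = read_tsv("id\tname\tnote\ng2\tbar\tok\n")

# Keep one header, then pad so every row has the same number of columns.
merged = pad_rows(first + second[1:])
```

Padding only restores the column count. It assumes the missing cells were at the end of the row, which matches the truncation case described above.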

The real fix would be to load each file as a list of dictionaries, where each key is a column header and each value is that row's entry for the column, then merge those. That might get reserved for future work.
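A rough sketch of what that dictionary-based merge could look like, using only the standard library `csv` module. The function name `merge_tsv_as_dicts` and the sample inputs are hypothetical and not part of this repo; the sketch assumes each file starts with a header row.

```python
import csv
import io


def merge_tsv_as_dicts(texts):
    """Merge TSV texts by column name rather than by position (hypothetical)."""
    fieldnames = []
    records = []
    for text in texts:
        # restval="" fills in cells missing from short (truncated) rows.
        reader = csv.DictReader(io.StringIO(text), delimiter="\t", restval="")
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        records.extend(reader)

    out = io.StringIO()
    # restval="" also covers columns that a given file does not have at all.
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter="\t",
        restval="",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Made-up sample data: one truncated row, and one file missing a column.
result = merge_tsv_as_dicts([
    "id\tname\tnote\ng1\tfoo\n",
    "id\tnote\ng2\tok\n",
])
```

Because values are keyed by header, a truncated row or a file with a different column set no longer causes a count mismatch. A row with more cells than its header would still need separate handling, since `DictWriter` raises on unknown keys by default.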