PNNL-CompBio / Snekmer

Pipeline to apply encoded Kmer analysis to protein sequences
BSD 3-Clause "New" or "Revised" License

[skip ci][bugfix] Fixed workflow-breaking bugs in cluster code #81

christinehc closed 1 year ago

christinehc commented 1 year ago

The previous code throws the following error: RuleException: KeyError in line 162 [...] 'kmerlist is not a file in the archive'

This occurs because snekmer.io.load_npz had been modified to return a tuple of a DataFrame and a list, but the change was implemented incorrectly: the list could not be read from the input file. Note that npz files in Snekmer no longer store the list of kmers (the kmer basis set), so they contain no 'kmerlist' entry.
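For reference, the underlying KeyError can be reproduced with NumPy alone. This is a minimal sketch, not the Snekmer code: the array names `ids` and `vectors` are hypothetical stand-ins for whatever the real archive stores. It only shows that asking an npz archive for an entry it does not contain raises the message quoted above.

```python
import io

import numpy as np

# Build an in-memory .npz archive with no 'kmerlist' entry.
# The entry names 'ids' and 'vectors' are hypothetical.
buffer = io.BytesIO()
np.savez(buffer, ids=np.array(["seq1", "seq2"]), vectors=np.zeros((2, 3)))
buffer.seek(0)

archive = np.load(buffer)
stored_names = list(archive.files)  # names of the arrays actually stored

try:
    archive["kmerlist"]  # this entry was never written to the archive
    error_message = None
except KeyError as err:
    error_message = err.args[0]  # the raw message, without repr quoting

print(stored_names)
print(error_message)
```

Checking `archive.files` before indexing is one way to fail with a clearer message when an expected entry is missing.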

The fix removes the second return value from snekmer.io.load_npz, so it again returns only a DataFrame, and the code in cluster.smk no longer relies on this function to obtain the kmer list. Instead, the kmer basis object is loaded separately and read to obtain the kmers.
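The separation described here might look roughly like the sketch below. Everything in it is an assumption made for illustration: `load_vectors` and `load_basis` are hypothetical helpers, not Snekmer functions; the loader returns a plain dict of arrays rather than a DataFrame; and the basis is stored as JSON only to keep the example self-contained.

```python
import json
import os
import tempfile

import numpy as np


def load_vectors(path):
    """Hypothetical loader: return only the stored arrays, no kmer list."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


def load_basis(path):
    """Hypothetical basis reader: return the kmers from a separate file."""
    with open(path) as handle:
        return json.load(handle)["kmers"]


with tempfile.TemporaryDirectory() as tmp:
    npz_path = os.path.join(tmp, "vectors.npz")
    basis_path = os.path.join(tmp, "basis.json")

    # Write the data and the kmer basis to two separate files.
    np.savez(npz_path, vectors=np.array([[1, 0, 2], [0, 3, 1]]))
    with open(basis_path, "w") as handle:
        json.dump({"kmers": ["AAA", "AAC", "ACA"]}, handle)

    # Each file is read by its own function; neither depends on the other.
    data = load_vectors(npz_path)
    kmers = load_basis(basis_path)

# One column of counts per kmer in the basis.
assert data["vectors"].shape[1] == len(kmers)
```

Keeping the loader's return type to a single object means callers that only need the data are unaffected by how, or whether, the kmer basis is stored.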