Open bluegenes opened 1 year ago
oh nice, thank you for reporting! I didn't know `n_unique_weighted_found` was added in v4.5+! let me noodle on this for a couple days and then I'll implement a fix. I will def switch to that naming scheme and only calculate that column if it isn't already in the output file... need to think if there are other things I can do to "catch" this. thank you again!
Here's where we calculate `n_unique_weighted_found`, in case it's helpful:
https://github.com/sourmash-bio/sourmash/blob/latest/src/sourmash/search.py#L496-L510
Had an error shared with me (🎉):
I see that the `n_unique_kmers` column is added during `read_taxonomy_annotate`, so the error is likely caused by using `read_csv` rather than `read_taxonomy_annotate` to read the file.

Would it be worth changing this internal column to `n_unique_weighted_found` to avoid this error for sourmash v4.5+, since we have the column now? We figured this name more clearly described the column info, but I'm not sure we discussed it outside of the sourmash PR that added it.

Or if you want to force folks to use `read_taxonomy_annotate` (I see you do a couple other things in there), is there a way to catch the error + suggest the solution?

thanks for the awesome software!
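
For what it's worth, here is a rough sketch of what "catch the error + suggest the solution" could look like. It is written in Python purely for illustration and is not the package's actual code: the function name `resolve_weighted_column` is hypothetical, and only the two column names and the two reader names come from this thread. It prefers the new column name when present, falls back to the internal one, and otherwise raises an error that points at `read_taxonomy_annotate`.

```python
# Column names as discussed in this thread; everything else is hypothetical.
NEW_NAME = "n_unique_weighted_found"  # present in sourmash v4.5+ output
OLD_NAME = "n_unique_kmers"           # added internally by read_taxonomy_annotate


def resolve_weighted_column(columns):
    """Return whichever weighted k-mer column is present, preferring the new name.

    Raises ValueError with a hint when neither column exists, which is the
    situation described above (file read with read_csv on pre-v4.5 output).
    """
    if NEW_NAME in columns:
        return NEW_NAME
    if OLD_NAME in columns:
        return OLD_NAME
    raise ValueError(
        f"Neither '{NEW_NAME}' nor '{OLD_NAME}' was found in the input. "
        "If the file was read with read_csv, try reading it with "
        "read_taxonomy_annotate instead."
    )
```

A downstream function could call this once on the table's column names and use the returned name everywhere, so either naming scheme works and a missing column produces an actionable message rather than a generic lookup failure.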