philippbayer closed this issue 2 years ago
The offending line is here:
It looks like it's a buffering issue: if there's no newline in the input string, something gets lost in the data transfer and taxonkit doesn't detect the input. Adding a second value fixes the buffering issue, as you discovered. I'm unsure why single taxids with multiple digits work but single digits don't: possibly buffering related as well?
In any case, adding a newline to the end of the input like so seems to fix the problem. PR incoming.
idlist = '\n'.join(map(str, ids)) + '\n'
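For anyone following along, here is a minimal sketch of what that change does to the string sent over stdin. The helper names `build_idlist` and `count_stdin_lines` are hypothetical, and the child process is a small Python stand-in rather than taxonkit itself, so this only shows the shape of the input and does not reproduce the bug.

```python
import subprocess
import sys


def build_idlist(ids):
    """Join the IDs one per line and terminate the last line with a newline."""
    return '\n'.join(map(str, ids)) + '\n'


def count_stdin_lines(text):
    """Send text to a child process on stdin and return how many lines it read.

    The child is a Python one-liner standing in for taxonkit (an assumption
    made so the sketch runs without taxonkit installed).
    """
    child_code = 'import sys; print(len(sys.stdin.readlines()))'
    proc = subprocess.run(
        [sys.executable, '-c', child_code],
        input=text, capture_output=True, text=True, check=True,
    )
    return int(proc.stdout)


if __name__ == '__main__':
    idlist = build_idlist([9])
    print(repr(idlist))               # a single ID still ends with a newline
    print(count_stdin_lines(idlist))  # lines seen by the stand-in child
```

With the trailing newline, a single ID is sent as a complete line, the same as every ID in a longer list.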
Thanks!!! It might have to do with subprocess's buffering mode; with one number we have only one line. Quoting the docs on bufsize:
0 means unbuffered (read and write are one system call and can return short)
1 means line buffered (only usable if universal_newlines=True, i.e., in text mode)
I guess the buffer's default size is larger than whatever space the numbers 1 to 9 take up, and adding a '\n' forces the line buffered mode? Anyways, it works now! Thank you so much, that was very fast.
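To make the bufsize discussion concrete, here is a rough sketch of where that parameter goes in a `subprocess.Popen` call. The function name `echo_via_popen` and the echoing child process are assumptions for illustration only. The child is a Python stand-in, not taxonkit, so it says nothing about how taxonkit itself reacts to a missing newline.

```python
import subprocess
import sys

# Stand-in child process (not taxonkit): copies stdin to stdout unchanged.
CHILD_CODE = 'import sys; sys.stdout.write(sys.stdin.read())'


def echo_via_popen(text, bufsize=-1):
    """Pipe text through the stand-in child using the given bufsize.

    bufsize=-1 is the default (system buffer size); bufsize=1 asks for line
    buffering, which is only honoured in text mode (text=True).
    """
    proc = subprocess.Popen(
        [sys.executable, '-c', CHILD_CODE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        text=True, bufsize=bufsize,
    )
    # communicate() writes the input, closes stdin, and waits for the child.
    out, _ = proc.communicate(input=text)
    return out


if __name__ == '__main__':
    print(repr(echo_via_popen('9\n', bufsize=1)))
    print(repr(echo_via_popen('9')))
```

Because `communicate()` closes the child's stdin after writing, this stand-in receives the data under either setting, so the parent-side buffer size may not be the whole story.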
Thanks for your patience while this slipped through the cracks for a few days. I've played around a bit with the bufsize parameter and wasn't able to get it to work with single character inputs, so I think we'll stick with the solution implemented in #26.
When running pytaxonkit.name() on a single taxonomy ID below 10, pytaxonkit returns an empty DataFrame. I'm using pytaxonkit v0.7.2 in Python 3.10.4, both installed via conda with taxonkit v0.7.2.
Example:
These IDs work fine in bash:
Interestingly, when I add more taxonomy IDs the problem goes away.
I am unsure what causes this. My current workaround is to add a taxonomy ID to the list when len(list) == 1, and then remove the last row.
Looking at the code in https://github.com/bioforensics/pytaxonkit/blob/d24a1a5b6b295771c6485e3d5b951a0f66fce957/pytaxonkit.py#L410, it seems to ignore the err variable, which is this when the taxonomy ID is < 10:
Isn't that interesting! I'm unsure how or why this happens.
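On the ignored err variable: one way a wrapper could surface it, sketched here with a hypothetical `run_checked` helper. This is not pytaxonkit's actual code. Raising on any stderr output is an assumption made for illustration, and a real tool may also write harmless warnings to stderr, in which case logging would fit better than raising.

```python
import subprocess
import sys


def run_checked(cmd, text):
    """Run cmd with text on stdin; raise if it fails or writes to stderr."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        text=True,
    )
    out, err = proc.communicate(input=text)
    if proc.returncode != 0 or err.strip():
        # Surface the child's message instead of silently dropping it.
        raise RuntimeError(f'{cmd[0]} reported a problem: {err.strip()}')
    return out


if __name__ == '__main__':
    # Stand-in child (not taxonkit) that echoes its stdin.
    echo = [sys.executable, '-c', 'import sys; sys.stdout.write(sys.stdin.read())']
    print(repr(run_checked(echo, '9\n')))
```

With a check like this, an empty result would show up as an error carrying the child's message rather than as an empty DataFrame.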