Closed kubu4 closed 1 month ago
Thanks for the detailed report! I've been traveling and am just now sitting down to look at this. Based on your other issue (https://github.com/RobertsLab/resources/issues/1941) I looked for the assembly (found at https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/016/432/855/GCF_016432855.1_SaNama_1.0/) and downloaded the RefSeq FNA to try and reproduce this myself.
Using pyfaidx 0.8.1.1 under Ubuntu 23.04 I downloaded the assembly and ran these commands:
$ faidx GCF_016432855.1_SaNama_1.0_genomic.fna NC_052339.1 | tee buffer.fa
(output omitted)
$ faidx --transform nucleotide buffer.fa
name start end A T C G N others
NC_052339.1 1 37339918 10586364 10549632 8085804 8112718 5400
I think this is basically what your script is doing - creating buffer files consisting of one sequence, in this case NC_052339.1, and then using the --transform flag to get the nucleotide stats. I don't see anything wrong with this, and looks like I can't reproduce the issue on my end. I'd point of that you can accomplish the same task without the buffer files though:
$ faidx GCF_016432855.1_SaNama_1.0_genomic.fna NC_052339.1 --transform nucleotide
name start end A T C G N others
NC_052339.1 1 37339918 10586364 10549632 8085804 8112718 5400
Maybe try this method in your script? For now I'm closing this as it's apparently not an issue with the pyfaidx code. I'm glad to help work on this more as well if you need.
Thanks so much for taking the time to look at, and respond, to this!
Also, thanks so much for this:
$ faidx GCF_016432855.1_SaNama_1.0_genomic.fna NC_052339.1 --transform nucleotide
I must've overlooked this capability!
I'm using PyFaidx CLI to extract a region from a genome FastA file.
Then, I'm attempting to use the PyFaidx
--transform nucleotide
command to get some stats, but I get aKeyErro
when it attempts to process the FastA.I'm running the following:
This results in the following output/error:
Here is the head of the FastA produced by PyFaidx (
"${output_top}/${c1q_buffer_fasta}"
):Here's the corresponding FastA index file:
I have minimal experience with Python, so am not entirely sure where to begin troubleshooting this. Any help/insight would be greatly appreciated. Thanks.
EDITED: Removed unnecessary use of
samtools
.