Closed by vinisalazar 4 months ago
Hi, I just ran KMA with the prebuilt RefSeq database and found that the field order in the template names of the res file seems to be wrong, e.g. 'NC_007508.1|taxid|316273' instead of '316273|NC_007508.1', which then throws an error if you run CCMetagen.py directly on it. I think I can fix this with awk on the res file, but thought I'd let you know in case this is connected to the issue above. I can provide more details of what I did if that helps.
Hi!
Thanks for the info!
Could you try running CCMetagen with the flag -r RefSeq or --reference_database RefSeq?
This flag was added to handle the different header format of the RefSeq database, but let me know if it works.
Ah, should have read the manual! That works, as did my awk bodge yesterday. Thanks.
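For anyone landing here later: the -r RefSeq flag is the supported route, but the workaround mentioned above amounts to reordering the fields in the first column of the res file. A minimal Python sketch of that idea follows. It assumes the res file is tab-separated, that comment/header lines start with '#', and that affected template names begin with accession|taxid|number as in the example above. The function names reorder_template and fix_res_lines are hypothetical and are not part of CCMetagen or KMA.

```python
import re

# Matches a leading "accession|taxid|<number>" token, optionally followed by
# more text (e.g. a sequence description). The pattern is an assumption based
# on the single example quoted in this thread.
_HEADER = re.compile(r"^(?P<acc>[^|\s]+)\|taxid\|(?P<taxid>\d+)(?P<rest>.*)$")


def reorder_template(name: str) -> str:
    """Turn 'NC_007508.1|taxid|316273 ...' into '316273|NC_007508.1 ...'."""
    match = _HEADER.match(name.strip())
    if match is None:
        return name  # leave anything unrecognised untouched
    return f"{match['taxid']}|{match['acc']}{match['rest']}"


def fix_res_lines(lines):
    """Rewrite the first tab-separated column of each non-comment line."""
    for line in lines:
        body = line.rstrip("\n")
        newline = line[len(body):]  # preserve the original line ending
        if body.startswith("#"):
            yield line
            continue
        first, sep, rest = body.partition("\t")
        yield reorder_template(first) + sep + rest + newline
```

To use it, open the res file, pass the file object to fix_res_lines, and write the yielded lines to a new file, leaving the original untouched. Names that do not match the assumed pattern pass through unchanged, so running it on an already correct file should be harmless.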
I am glad it works =)
The newly updated database does not have taxonomic lineages in the header, only accession + taxid. Investigate whether that is a problem with the upstream database formatting scripts (e.g. rename_nt.py) or something else.
TO-DO:
results.ccm.csv file