Closed GeoMicroSoares closed 1 year ago
Hi @GeoMicroSoares
Thank you for the feedback and for using (or trying to use) it. I have now added more information on how to set it up, and added a test file that will run quickly.
Can you confirm whether you can get it to run with the test.fna file that is now provided, and whether you get the same result we do?
python tax_myPHAGE.py -i test.fna -t 8
It looks like it is failing at the moment because the input is not similar enough to anything in the database. But it shouldn't just fail; it should give a message that the input is likely a new genus. So clearly something is going wrong that we haven't anticipated.
Can you confirm the above first, please? If that works, can you run one of your sequences with
python tax_myPHAGE/tax_myPHAGE.py -t 10 -i viruses_oneGenomeFasta/viral_scaffolds_mgshot_S7938Nr1_lt70_checkv_noEuks.id_mgshot_S7938Nr1_27_length_378353_cov_15.fasta --Figures F -v
Using -v so we get a bit more output. Does the "mash.txt" you get in your output directory have anything in it?
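A quick way to answer the mash.txt question is a small script like the one below. It is only a sketch: it assumes mash.txt is a plain-text file with one hit per line, the helper names `count_mash_hits` and `describe_mash_result` are made up for illustration, and the "new genus" message is a paraphrase of the intended behaviour described above, not the tool's actual output.

```python
from pathlib import Path


def count_mash_hits(mash_file):
    """Return the number of non-blank lines in a Mash output file.

    A missing file is treated the same as an empty one (0 hits).
    Assumption: one hit per line, as in plain-text Mash output.
    """
    path = Path(mash_file)
    if not path.is_file():
        return 0
    with path.open() as handle:
        return sum(1 for line in handle if line.strip())


def describe_mash_result(mash_file):
    """Summarise the file instead of failing when there are no hits."""
    hits = count_mash_hits(mash_file)
    if hits == 0:
        # Hypothetical wording; the real tool may phrase this differently.
        return "No Mash hits found: the query may represent a new genus."
    return f"{hits} Mash hit(s) found."
```

Calling `describe_mash_result("mash.txt")` from inside the output directory shows whether the file is empty, which separates "no similar genomes" from a failure later in the pipeline.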
@GeoMicroSoares Was your problem fixed by the new release?
@amillard Hi, I encountered a similar situation. I tried to run
python tax_myPHAGE.py -i test.fna -t 8
but I couldn't find your test file (test.fna). Could you please help me solve this problem?
@778055611
Sorry, the file was moved around in the reorganisation. There is a file UP30.fsa in the Uploads folder; you can use that as a test file.
UP30.fsa Query sequence is: Class: Caudoviricetes Family: Drexlerviridae Subfamily: Tunavirinae Genus: Tunavirus Species: Tunavirus new_name
MN478483 is the Taipeivirus ICTV exemplar
taxmyphage -i MN478483.fasta -t 8
Number of phage genomes detected with mash distance of < 0.2 is:5
Classifying: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "~/.local/bin/taxmyphage", line 8, in
IndexError: list index out of range
It looks like there is a problem with reading the FASTA defline in the query file.
I changed the defline from
gi|1774218871|gb|MN478483.1| Klebsiella phage UPM 2146, complete genome to MN478483, and now the report looks fine.
OK, so most likely it is because the genome identifier has characters that are not allowed in a folder name. I'll try to fix that in the new release.
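As a workaround until then, the identifier can be cleaned before the tool sees it. The sketch below shows one possible approach and is not necessarily how taxmyphage handles it: the function name `safe_genome_id` and the set of allowed characters are assumptions chosen for illustration.

```python
import re


def safe_genome_id(defline):
    """Derive a filesystem-safe identifier from a FASTA defline.

    Assumption: only letters, digits, '.', '_' and '-' are kept;
    every other run of characters (such as '|') becomes a single '_'.
    """
    # Drop the leading '>' and keep only the first whitespace-delimited token.
    stripped = defline.lstrip(">").strip()
    token = stripped.split()[0] if stripped else ""
    # Replace disallowed characters, then trim underscores left at the ends.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", token).strip("_")
    # Fall back to a placeholder if nothing usable remains.
    return cleaned or "unnamed_sequence"
```

For the defline reported above, this returns `gi_1774218871_gb_MN478483.1`, which can be used as a folder name. Renaming the sequence to the bare accession by hand, as was done above, works equally well.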
Hi there @amillard,
I'm trying to apply this tool to my dataset of a couple of hundred metagenome-recovered viruses (consensus viral sequences according to VIBRANT and VirSorter2, with >70% CheckV completeness and checked for eukaryotic sequences). However, I keep getting an
IndexError: list index out of range
error. More output below; the genome I tried the tool with is 378,353 bp, as indicated in the name. As a recommendation, by the way, it would be great to be able to direct the output to a specific directory (maybe via the prefix option), and more explicit information on how to set up the tool and databases would also be helpful (that all databases should be in the cloned directory, for example). Thank you for making this tool available, and thanks in advance for the help. Looking forward to checking my data out with it!