faylward / viralrecall

Detection of NCLDV signatures in 'omic data

.vregion.annot.tsv #30

Open Changle-cell opened 2 days ago

Changle-cell commented 2 days ago

Sorry to bother you. The output file '.vregion.annot.tsv' only contains results from one replicon. That is not the intended behavior, is it? There seems to be a mistake at line 517: df3 = df2.loc[df2['replicon'] == rep] or perhaps the results should be appended to the TSV file rather than replacing it at lines 528 and 530: subset.to_csv(os.path.join(base, relpathbase+".vregion_annot.tsv"), sep='\t', index_label="protein_ids") I look forward to your reply. Thank you.
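To illustrate the overwrite pattern suspected above, here is a minimal sketch. It is not the viralrecall code: only the 'replicon' column name and the to_csv arguments come from the lines quoted in this issue, while the toy table, the file names, and the loop itself are assumptions made for illustration. It shows that calling to_csv on the same path inside a per-replicon loop keeps only the last replicon, whereas concatenating the subsets and writing once keeps all of them.

```python
import os
import tempfile

import pandas as pd

# Toy annotation table (hypothetical values); 'replicon' is the column
# name used in the line quoted from the script.
df2 = pd.DataFrame(
    {"replicon": ["contig_A", "contig_A", "contig_B"], "score": [12.0, 8.5, 10.1]},
    index=["prot_1", "prot_2", "prot_3"],
)

out_dir = tempfile.mkdtemp()
overwritten = os.path.join(out_dir, "overwritten.tsv")  # hypothetical file name
combined = os.path.join(out_dir, "combined.tsv")        # hypothetical file name

# Pattern 1: write inside the loop. DataFrame.to_csv opens the file in
# mode "w" by default, so each iteration replaces the previous contents.
for rep in df2["replicon"].unique():
    subset = df2.loc[df2["replicon"] == rep]
    subset.to_csv(overwritten, sep="\t", index_label="protein_ids")

# Pattern 2: collect the per-replicon subsets and write a single time.
subsets = [df2.loc[df2["replicon"] == rep] for rep in df2["replicon"].unique()]
pd.concat(subsets).to_csv(combined, sep="\t", index_label="protein_ids")

# Read both files back and compare which replicons survived.
replicons_overwritten = set(pd.read_csv(overwritten, sep="\t")["replicon"])
replicons_combined = set(pd.read_csv(combined, sep="\t")["replicon"])
print(replicons_overwritten)
print(replicons_combined)
```

In this toy case the first file ends up with only the last replicon and the second with both. Whether the real script follows pattern 1 would need to be checked against its source.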

faylward commented 19 hours ago

Do you have a sample input? I have used this with multiple replicons many times, so I cannot reproduce that error.

Changle-cell commented 18 hours ago

You can just merge the 'examples' files into a single file with 'cat' and run the Python script. You will find that the output file '.vregion.annot.tsv' only contains results from one of the two genomes.
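One way to check this on any output file is to count rows per replicon. The sketch below is only a sketch: it assumes the annotation TSV has a 'replicon' column, which this thread does not confirm, and the helper name rows_per_replicon and the demo table are hypothetical.

```python
import csv
import io
from collections import Counter


def rows_per_replicon(handle, column="replicon"):
    """Count rows per value of `column` in a tab-separated file handle.

    The default column name 'replicon' is an assumption about the
    output layout, not something documented in this thread.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[column] for row in reader)


# Small in-memory table standing in for a real annotation file.
demo = io.StringIO(
    "protein_ids\treplicon\tscore\n"
    "prot_1\tcontig_A\t12.0\n"
    "prot_2\tcontig_B\t10.1\n"
    "prot_3\tcontig_B\t9.4\n"
)
counts = rows_per_replicon(demo)
print(dict(counts))  # {'contig_A': 1, 'contig_B': 2}
```

For a real file, pass open(path, newline="") as the handle. A result with a single key would mean that only one replicon made it into the file.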

faylward commented 15 hours ago

Use the -c flag to get all results. By default only NCLDV regions are listed in the output.

Changle-cell commented 3 hours ago

Sorry, do you mean that without the '-c' flag the script will report at most one of the input sequences in a FASTA file as having viral signatures?

I have run the script on 20 inputs, and every 'vregion.annot.tsv' output file contains either no results or results from only one of the input sequences, so I don't think that is a coincidence.

If you need sample inputs, you can download from Chlorokybus_atmophyticus and Klebsormidium_nitens