nerve-bio / NERVE

NERVE is a user-friendly software environment for the in silico identification of the best vaccine candidates from whole proteomes of bacterial pathogens. The purpose of this project is to update it by implementing new modules with machine-learning-based methods and improving the performance of the existing ones.
MIT License

ValueError: /home/output/proteome1.fasta is not in fasta format #26

Open Ayman344 opened 9 months ago

Ayman344 commented 9 months ago

image

Please see the image and kindly advise what to do about this kind of problem. Why does the output file also need to be in FASTA format? If so, how can I fix it? I am trying to upload a protein sequence FASTA file as input.

wrote config file /root/.config/epitopepredict/default.conf
Start NERVE 2.0
10% done
Traceback (most recent call last):
  File "/usr/nerve_python/NERVE/code/NERVE.py", line 506, in <module>
    main()
  File "/usr/nerve_python/NERVE/code/NERVE.py", line 358, in main
    list_of_fasta_proteins, proteome1_new_path = quality_control(args.proteome1, args.working_dir, upload=True)
  File "/workdir/code/Quality_control.py", line 127, in quality_control
    fasta_list = is_fasta(path_to_fasta)
  File "/workdir/code/Quality_control.py", line 103, in is_fasta
    raise ValueError(f'{filename} is not in fasta format')
ValueError: /home/output/proteome1.fasta is not in fasta format

FranceCosta commented 8 months ago

Could you provide the content of your input file and of the log file?

Ayman344 commented 8 months ago

Could you provide an email address so that I can send them to you?

FranceCosta commented 8 months ago

Would you be able to provide a wetransfer link instead (https://wetransfer.com/)?

Ayman344 commented 7 months ago

I will share the files with you. It is taking me some time because I am trying to reproduce the problem.

abrozzi commented 4 months ago

Hi, I got the same error with this file:

>sp|A5A616|MGTS_ECOLI Small protein MgtS OS=Escherichia coli (strain K12) OX=83333 GN=mgtS PE=1 SV=1
MLGNMNVFMAVLGIILFSGFLAAYFSHKWDD
>sp|O32583|THIS_ECOLI Sulfur carrier protein ThiS OS=Escherichia coli (strain K12) OX=83333 GN=thiS PE=1 SV=1
MQILFNDQAMQCAAGQTVHELLEQLDQRQAGAALAINQQIVPREQWAQHIVQDGDQILLF
QVIAGG
>sp|P00350|6PGD_ECOLI 6-phosphogluconate dehydrogenase, decarboxylating OS=Escherichia coli (strain K12) OX=83333 GN=gnd PE=1 SV=2
MSKQQIGVVGMAVMGRNLALNIESRGYTVSIFNRSREKTEEVIAENPGKKLVPYYTVKEF
VESLETPRRILLMVKAGAGTDAAIDSLKPYLDKGDIIIDGGNTFFQDTIRRNRELSAEGF
NFIGTGVSGGEEGALKGPSIMPGGQKEAYELVAPILTKIAAVAEDGEPCVTYIGADGAGH
YVKMVHNGIEYGDMQLIAEAYSLLKGGLNLTNEELAQTFTEWNNGELSSYLIDITKDIFT
KKDEDGNYLVDVILDEAANKGTGKWTSQSALDLGEPLSLITESVFARYISSLKDQRVAAS
KVLSGPQAQPAGDKAEFIEKVRRALYLGKIVSYAQGFSQLRAASEEYNWDLNYGEIAKIF
RAGCIIRAQFLQKITDAYAENPQIANLLLAPYFKQIADDYQQALRDVVAYAVQNGIPVPT
FSAAVAYYDSYRAAVLPANLIQAQRDYFGAHTYKRIDKEGVFHTEWLD
>sp|P00363|FRDA_ECOLI Fumarate reductase flavoprotein subunit OS=Escherichia coli (strain K12) OX=83333 GN=frdA PE=1 SV=3
MQTFQADLAIVGAGGAGLRAAIAAAQANPNAKIALISKVYPMRSHTVAAEGGSAAVAQDH
DSFEYHFHDTVAGGDWLCEQDVVDYFVHHCPTEMTQLELWGCPWSRRPDGSVNVRRFGGM
KIERTWFAADKTGFHMLHTLFQTSLQFPQIQRFDEHFVLDILVDDGHVRGLVAMNMMEGT
LVQIRANAVVMATGGAGRVYRYNTNGGIVTGDGMGMALSHGVPLRDMEFVQYHPTGLPGS
GILMTEGCRGEGGILVNKNGYRYLQDYGMGPETPLGEPKNKYMELGPRDKVSQAFWHEWR
KGNTISTPRGDVVYLDLRHLGEKKLHERLPFICELAKAYVGVDPVKEPIPVRPTAHYTMG
GIETDQNCETRIKGLFAVGECSSVGLHGANRLGSNSLAELVVFGRLAGEQATERAATAGN
GNEAAIEAQAAGVEQRLKDLVNQDGGENWAKIRDEMGLAMEEGCGIYRTPELMQKTIDKL
AELQERFKRVRITDTSSVFNTDLLYTIELGHGLNVAECMAHSAMARKESRGAHQRLDEGC
TERDDVNFLKHTLAFRDADGTTRLEYSDVKITTLPPAKRVYGGEADAADKAEAANKKEKA
NG

This is the head of a longer file downloaded from UniProt: the E. coli K12 proteome (taxonomy_id:83333). Passing the proteome ID instead (-p1 UP000000625) seems to work.

HTH

FranceCosta commented 4 months ago

@abrozzi thanks for reporting this. Can you provide the full command that gave rise to the error?
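
For anyone hitting the same ValueError, a quick local check of the input file may help narrow things down before sharing it. The sketch below is not NERVE's own `is_fasta` (its implementation is not shown in this thread); `diagnose_fasta` is a hypothetical standalone helper that uses only the Python standard library and looks for a few common causes of FASTA parsing trouble: an empty file, a byte order mark, carriage returns, sequence data before the first `>` header, and headers with no sequence. Whether any of these is what trips NERVE is an assumption.

```python
from pathlib import Path


def diagnose_fasta(path):
    """Return a list of human-readable problems found in a FASTA file.

    Hypothetical helper, not part of NERVE. An empty list only means that
    none of the checks below fired, not that NERVE will accept the file.
    """
    raw = Path(path).read_bytes()
    if not raw.strip():
        return ["file is empty"]

    problems = []
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte order mark")
    if b"\r" in raw:
        problems.append("file contains carriage returns (non-Unix line endings)")

    # Decode leniently so that odd bytes do not abort the diagnosis.
    text = raw.decode("utf-8-sig", errors="replace")

    header = None         # header line of the record currently being read
    has_sequence = False  # whether that record has at least one sequence line
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None and not has_sequence:
                problems.append(f"record {header!r} has no sequence")
            header, has_sequence = line, False
        elif header is None:
            # Report this once, then treat the stray lines as one record.
            problems.append(f"line {number}: sequence data before the first '>' header")
            header, has_sequence = "(missing header)", True
        else:
            has_sequence = True

    if header is not None and not has_sequence:
        problems.append(f"record {header!r} has no sequence")
    return problems
```

Calling `diagnose_fasta("proteome1.fasta")` on the file passed with `-p1` returns a list of findings that can be pasted into the issue. An empty list suggests the file itself is well formed and the cause lies elsewhere, for example in how the path is passed to the container.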