Open 000generic opened 6 years ago
Hi Eric,
Does the FASTA file have a ">" at the start of each header?
Kai Battenberg
Sent from iPhone
On 2018/02/16 13:34, Eric Edsinger notifications@github.com wrote:
I am getting the following error when I run just the first perl script:
[eedsinger@cluster5 orthored-flatworm-seasquirt-octopus-human-fly-worm]$ bash 01-orthored-6-species
RUNNING SCRIPT: /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/OrthoReD_v20170412 step-01-02.pl
(Procedure-00)Setting up the environment.
Following species are included in the species list:
Botryllus schlosseri
Caenorhabditis elegans
Drosophila melanogaster
Homo sapiens
Octopus bimaculoides
Schmidtea mediterranea
Following files are included in the query:
./proteome_Mollusca_Octopus_bimaculoides.okay.aa
(Procedure-01)Generate one quality-checked query file in single-line FASTA format.
Sequences with undetermined reading frame are removed.
ERROR: temp3.fas is not in single-line FASTA format. Check line 1.
The query input file is a single line fasta file (proteome_Mollusca_Octopus_bimaculoides.okay.aa) that I previously structured per your instructions and looks like:
Ob_Ocbimv22002748m_na_na MEPLELEETVIATLLADSEIKSVMTNVIQELPVIAQDTRMKAVGDEYITVSVIRVSVIGVSVIGVSVIRAPVIRVTVTRISFIRVSFIRVSVIGVSVIGVSVIGVSVIRISVIRVSFIRVSVIRVLFIRVSVISLSVWPKMLAPRDYPEYLYPFFPFLFLPFSQECFQKRLQMF Ob_Ocbimv22002861m_na_na MPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSL Ob_Ocbimv22003975m_na_na MDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHL
temp1.fas looks ok in its structure:
Ob_Ocbimv22002748m_na_na MEPLELEETVIATLLADSEIKSVMTNVIQELPVIAQDTRMKAVGDEYITVSVIRVSVIGVSVIGVSVIRAPVIRVTVTRISFIRVSFIRVSVIGVSVIGVSVIGVSVIRISVIRVSFIRVSVIRVLFIRVSVISLSVWPKMLAPRDYPEYLYPFFPFLFLPFSQECFQKRLQMF Ob_Ocbimv22002861m_na_na MPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSL Ob_Ocbimv22003975m_na_na MDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHL
but temp2.fas and temp3.fas both seem to lack newline characters and look like this (temp3.fas):
OB_OCBIMV22002748M_NA_NAMEPLELEETVIATLLADSEIKSVMTNVIQELPVIAQDTRMKAVGDEYITVSVIRVSVIGVSVIGVSVIRAPVIRVTVTRISFIRVSFIRVSVIGVSVIGVSVIGVSVIRISVIRVSFIRVSVIRVLFIRVSVISLSVWPKMLAPRDYPEYLYPFFPFLFLPFSQECFQKRLQMFOB_OCBIMV22002861M_NA_NAMPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSLOB_OCBIMV22003975M_NA_NAMDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHLOB_OCBIMV22007001M_NA_NAMVIIVMMMVKLVMSDNDDDDGRGREGGRGEEGQRGEKEEVEKRKEEEEERKRIRKMRKRKRRQRKGRKKKRRRKRKIKKKRRKRKIKRKRRKRKRRKQOB_OCBIMV22007914M_NA_NAMRDEMANLYKKTHPWSYIPWNIIDLPFLYKEPPQKTLPDFSNDEIYFDVGLGKRGLPFSLDIAKRREKPLLTKGLLIKKLELLAQRAHHLELPEDNDKRRQSLAHSOB_OCBIMV22008886M_NA_NAMTSLRLFVVLTVVPSIIFVLSTTLVDSASHENRRLSILVFGGNGFIGSATVSRLLKTDHSITIINRGNWYWDSNVLIKPYVRHLKCDRMQSLYNCQDLVDFFKTSSSIYFDAIIDFSAYHPFAVREVLAIFRSKVGLYVLISTDSVYDVCMKNHTAPSKETDAVRPFAETMREDYAKNDNYGHLKLQCEEELQQQPEADKIPFLIYRLPDVIGPKDNTYRWWLYQLWMKIRTYLERPVSLPANLVKQEMSLVYVEDVADIIVQYLTGSEDINNEAYNLAIDETPTLYEVLSDIKDSLNLTDLNIFIEPLTSSSIYLFPSVKLGPVDVSKAKEKLNWKPTSWEKILKEIIAFYEKAIKEEKYEIPRRDVIHMMQKHLTRRPLQVLTGLRAVYGIDYPFVKEELOB_OCBIMV22008960M_NA_NAMNGVIKLRGRQRKERERERERERERERERERERERERERERERAKGGEWTKRRKKKQTRLREIKNERRRKERKKWRYFMSESESRGTRILEOB_OCBIMV22009214M_NA_NAMATDPLNMAGRNGIVVPSTSTRQAIAKSFQLEYCWFCGRPMDFFSLSSDNDHMEKLSELERLLAQAQNEKMHLIDEQVKQRESEMVALQEERLKREELERKLQEEALLREQLVQQQVQLREKQIQQARPLT
RYLPIRNKDFDLRQHIEAAGHNLDSCPLVIVTMTSCRGYLQKMGSKFKTWHKRWFFFDRMKRSLLYYSDKNETKARGGIYFQAIEEVYVDHLRTVKSPNPKLTFCVKTYDRTYYLVGPSAEAMRIWIDVIFTGAEGYHTFOB_OCBIMV22009911M_NA_NAKKKVKTLHLILIIKSIVTVTHITTCRGSSCCTYNNDSNYCCDSNNAPNDTNNNRYFDKLSCTIVSFLCACCITSSGQCVCFIGVYNSSNACWAKAEKGDENCRKHLIFIVLLWLAVNNNSLLLLLWLLLDWISOB_OCBIMV22010443M_NA_NAMRRIHAITRSASKTKKPKYHDFLKNCLAEKFTKDIRRMFRNLENSETMARKGVMKRETOB_OCBIMV22011209M_NA_NAMNNFNYQSGKRNSQNAGVRRRGRKQLHEIQNNFNDDIQILEVSLPKRGPVKERPIEKIDLGVKENYSPPTKAIDQCSSEVPQVNYQRDDKLEISFLEGDRAEKSLDLGWESQEIKRNILLKLATDDDYSFTKTEECFIHERAKLYVVNMFPEKHYYYIPDNAEEQPQSIDSDEEILNEGQVFFPTSYSGKRKRRSYNNFNCYVTHHKNDDLYDNSHILPKGTRLLKPPNKLVELVAQAIDGSPDGLLQVHQIYTVLQNKYPYFRFMDRMAINSWRSSIRHALYQKWFRKIHFSTESINRKGCYWTINRQFSPKTWTMPGFQNISSVYFTTDSTENCTSNFQPETQLPEVEAPGPNNSITYEQCNSSNEFQPVINMPQEYSIAIRDDQQNEDEITTLIKDPDPILIPWSSESVTSVGHIENAALLSKDSQMSQLPISFQMNTVIDCEEMLNASPDTWLQQCSGSSDWPQNVDDASLCYPYTQQILLATFPLATSLANTOB_OCBIMV22011909M_NA_NAMVDSQCTVKELDQWIKQLYECNRLSETQVKTLCEKAKEILSQESNVQQVKCPVTVCGDIHGQFHDLMELFSIGGRIPDTNYLFMGDYVDRGYYSVETITLLVVLKVRYKDRITILRGNHESRQITQVYGFYDECLRKYGNANVWKFFTDLFDFLPLTALVDEEIFCLHGGLSPSICNLDNINTLFRKQEIPHEGPMCDLLWSDPGTHNGWSLSPRGLGYTFGKCISELFCYSNGLSLISRAHQLAMESCHFYFQFGVLGDIFNLFTHNIHVIWYGYTNDLTASVSLIQYDNIRLMVFGYMIRLKIIAPEYRYFSIFLOB_OCBIMV22012235M_NA_NAMVRRIHLISDELTEQLNTVSKKFAWFSLALDKSTDNQDTVQLRIFVRGIDENFVITEKLLVLESMKNTTTGQDLFECAVDCVEKSAVSWNRMASITTDGARVFTGKNVGMIKLLENKLKAEHPDGDILPFHYILRQKSFCKPALDLKHVVNPVMSMVNTIRIRAFYHLQFKSHLEDMEAQYGDVIYHNSVRWHVRSFKTKLGLFARKQKVPASVSSKIRDHWLSLEDEVTRRFQDFKKIEPDLNLLSYPLTADIDTAPEEVQLELIDMHFSCMKINEOB_OCBIMV22014287M_NA_NATPPPPVAVYNTEDGEEKNETVQSPIVIAVTATAVTVTNAAAATLIAAAVLVVVTVTPSTKETTTAIATTTTTMIAKVTTATTTKPIPTTTITTVAAAATTTTTRTLPTTKTLPQTQTTWIQEQTRTQRSKTADTTVTAHTDTTOB_OCBIMV22014735M_NA_NAMKYRITSRNCTYIDTHTESVETINKYTEEHTHQHTHTAFVERAINETMYKHTHTHNWIWKHSHTYSSMYTHTHTHTQPNTLTHIQVCTTSHQEVLTFIOB_OCBIMV22016260M_NA_NA
Folders and files produced include:
01_QUERY_QCD (empty)
step-01-02_LOG.txt
Step-01_Queries_RAW (contains a copy of the query input fasta file)
Step-02_QUERY (empty)
temp1.fas
temp2.fas
temp3.fas
the log text file that was produced reads:
RUNNING SCRIPT: /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/OrthoReD_v20170412 step-01-02.pl
(Procedure-00)Setting up the environment.
Following species were included in the species list:
Botryllus schlosseri
Caenorhabditis elegans
Drosophila melanogaster
Homo sapiens
Octopus bimaculoides
Schmidtea mediterranea
Following files were included in the query:
./proteome_Mollusca_Octopus_bimaculoides.okay.aa
(Procedure-01)Generate one quality-checked query file in single-line FASTA format.
Sequences with undetermined reading frame were removed.
step-01-02_LOG.txt (END)
And the command line run was:
perl ./OrthoReD_v20170412/step-01-02.pl --query ./proteome_Mollusca_Octopus_bimaculoides.okay.aa --q_seq_type AA --spp_list ./01-list-6-species
Any ideas or suggestions on what to correct would be greatly appreciated!
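The format error above can be checked independently of OrthoReD. The following is a minimal sketch, assuming that "single-line FASTA" means strictly alternating lines: a header starting with ">" followed by exactly one sequence line. The function name `check_single_line_fasta` is hypothetical and not part of OrthoReD, and the check OrthoReD itself performs may differ.

```python
def check_single_line_fasta(lines):
    """Return a list of (line_number, problem) for a single-line FASTA file.

    Assumption: odd-numbered lines are headers starting with ">",
    even-numbered lines are non-empty sequence lines.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if number % 2 == 1:
            # Header line: must begin with ">"
            if not text.startswith(">"):
                problems.append((number, "header does not start with '>'"))
        else:
            # Sequence line: must be non-empty and must not look like a header
            if not text or text.startswith(">"):
                problems.append((number, "expected a sequence line"))
    if len(lines) % 2 == 1:
        problems.append((len(lines), "header without a sequence line"))
    return problems


# Shortened example records based on the first header shown in this thread
good = [">Ob_Ocbimv22002748m_na_na", "MEPLELEETVIATLLADSEIKSVM"]
bad = ["Ob_Ocbimv22002748m_na_na", "MEPLELEETVIATLLADSEIKSVM"]

print(check_single_line_fasta(good))
print(check_single_line_fasta(bad))
```

Running it on the first few lines of a query file (for example, the lines returned by `open(path).readlines()`) should show quickly whether the ">" markers are missing.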
Good call - I was staring at things so long I was no longer seeing all the details!
It now completes successfully if I set q_clean to NO - however if I set it to YES I get the following:
[eedsinger@cricket orthored-flatworm-seasquirt-octopus-human-fly-worm]$ bash 01-orthored-6-species
RUNNING SCRIPT: /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/OrthoReD_v20170412 step-01-02.pl
(Procedure-00)Setting up the environment.
Following species are included in the species list:
Botryllus schlosseri
Caenorhabditis elegans
Drosophila melanogaster
Homo sapiens
Octopus bimaculoides
Schmidtea mediterranea
Following files are included in the query:
./proteome_Mollusca_Octopus_bimaculoides.okay.aa
(Procedure-01)Generate one quality-checked query file in single-line FASTA format.
Sequences with undetermined reading frame are removed.
ERROR: >Ob_Ocbimv22002748m_na_na could not be cleaned.
Prior to OrthoReD I structured the headers according to your guidelines - is this error showing up because the files look good and do not require cleaning - or is there likely something else causing trouble?
My query (and subject) fastas now look like:
Ob_Ocbimv22002861m_na_na MPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSL Ob_Ocbimv22003975m_na_na MDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHL Ob_Ocbimv22007001m_na_na MVIIVMMMVKLVMSDNDDDDGRGREGGRGEEGQRGEKEEVEKRKEEEEERKRIRKMRKRKRRQRKGRKKKRRRKRKIKKKRRKRKIKRKRRKRKRRKQ
Eric,
Keep --q_clean set to NO. Setting it to YES only works if you happen to be using the organisms that I used in the paper.
By the way, you mentioned that you would be comparing OrthoReD with OrthoDB. In general, OrthoDB is more inclusive than OrthoReD, so keep that in mind.
Kai Battenberg
Sent from iPhone
Hi Kai,
I have the first two Perl scripts running successfully with my test species. I'm now troubleshooting the third Perl script. It seems to start up ok using:
perl ../orthored-active/OrthoReD_v20170412/step-05-06.pl --query ./Step-02_QUERY/01_QUERY_QCD/QUERY.fas --q_seq_type AA --database ./Step-04_DATABASE/02_DATABASE/ --db_type AA --spp_list ./01-list-6-species --og 'Schmidtea mediterranea' --blast_type NCBI --vraxml SSE3 --threads 8 --loci_threshold 5
Initial screen output ends:
(Step-06)For each MCL_AA_ALN file, a rooted maximum likelihood tree is generated. RAxML was used to construct the trees. The most distant outgroup that would result in the largest number of ingroup species was selected as the root. When no tips of outgroup is present in the tree, the tree was rooted at midpoint. (Step-07)For each MCL_AA_TRE file, a single-lined FASTA file with orthologs is generated. Before the selection of orthologs, all branches longer than 2 substitutions/site are cut from MCL_AA_TRE file. Any splcie variants present in the database were also included. (Procedure-08)Summarizing the output. (Procedure-09)Formatting directories.
however errors are now being thrown one after another looking like:
sh: line 1: 17116 Aborted blastp -query BLASTER_01_temp_Procedure01_734032694789.fas -db /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA -db_soft_mask 21 -use_sw_tback -evalue 1e-03 -max_target_seqs 25000 -outfmt '6 qseqid sseqid evalue pident qcovs length' -num_threads 8 >> BLASTER_01_Procedure01_734032694789.txt
sh: line 1: 17567 Aborted blastp -query BLASTER_01_temp_Procedure01_734032694789.fas -db /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA -db_soft_mask 21 -use_sw_tback -evalue 1e-03 -max_target_seqs 25000 -outfmt '6 qseqid sseqid evalue pident qcovs length' -num_threads 8 >> BLASTER_01_Procedure01_734032694789.txt
sh: line 1: 17747 Aborted blastp -query BLASTER_01_temp_Procedure01_734032694789.fas -db /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA -db_soft_mask 21 -use_sw_tback -evalue 1e-03 -max_target_seqs 25000 -outfmt '6 qseqid sseqid evalue pident qcovs length' -num_threads 8 >> BLASTER_01_Procedure01_734032694789.txt
sh: line 1: 17892 Aborted blastp -query BLASTER_01_temp_Procedure01_734032694789.fas -db /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA -db_soft_mask 21 -use_sw_tback -evalue 1e-03 -max_target_seqs 25000 -outfmt '6 qseqid sseqid evalue pident qcovs length' -num_threads 8 >> BLASTER_01_Procedure01_734032694789.txt
sh: line 1: 18252 Segmentation fault blastp -query BLASTER_01_temp_Procedure01_734032694789.fas -db /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA -db_soft_mask 21 -use_sw_tback -evalue 1e-03 -max_target_seqs 25000 -outfmt '6 qseqid sseqid evalue pident qcovs length' -num_threads 8 >> BLASTER_01_Procedure01_734032694789.txt
and in addition the system seems to complain with:
7f9bd4ad5000-7f9bd52d5000 rw-p 00000000 00:00 0 [stack:2066] 7f9be0000000-7f9be0021000 rw-p 00000000 00:00 0 7f9be0021000-7f9be4000000 ---p 00000000 00:00 0 7f9be64de000-7f9be64df000 ---p 00000000 00:00 0 7f9be64df000-7f9be68df000 rw-p 00000000 00:00 0 [stack:2069] 7f9be68df000-7f9be68e0000 ---p 00000000 00:00 0 7f9be68e0000-7f9be6ce0000 rw-p 00000000 00:00 0 [stack:2068] 7f9be6ce0000-7f9be6ce1000 ---p 00000000 00:00 0 7f9be6ce1000-7f9be72e1000 rw-p 00000000 00:00 0 [stack:2067] 7f9be72e1000-7f9be7373000 r--p 00000000 00:ac 6548218 /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA.paa 7f9be7373000-7f9be7497000 r--p 00000000 00:ac 6548222 /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA.pin 7f9be7497000-7f9be7499000 r-xp 00000000 fd:00 2099592 /usr/lib64/libdl-2.17.so 7f9be7499000-7f9be7699000 ---p 00002000 fd:00 2099592 /usr/lib64/libdl-2.17.so 7f9be7699000-7f9be769a000 r--p 00002000 fd:00 2099592 /usr/lib64/libdl-2.17.so 7f9be769a000-7f9be769b000 rw-p 00003000 fd:00 2099592 /usr/lib64/libdl-2.17.so 7f9be769b000-7f9be76b1000 r-xp 00000000 00:9d 4524650552 /automounts/bioware/bioware/linuxOpteron/gcc-4.9.3/lib64/libgcc_s.so.1 7f9be76b1000-7f9be78b0000 ---p 00016000 00:9d 4524650552 /automounts/bioware/bioware/linuxOpteron/gcc-4.9.3/lib64/libgcc_s.so.1 7f9be78b0000-7f9be78b1000 rw-p 00015000 00:9d 4524650552 /automounts/bioware/bioware/linuxOpteron/gcc-4.9.3/lib64/libgcc_s.so.1 7f9be78b1000-7f9be7a69000 r-xp 00000000 fd:00 2099586 /usr/lib64/libc-2.17.so 7f9be7a69000-7f9be7c69000 ---p 001b8000 fd:00 2099586 /usr/lib64/libc-2.17.so 7f9be7c69000-7f9be7c6d000 r--p 001b8000 fd:00 2099586 /usr/lib64/libc-2.17.so 7f9be7c6d000-7f9be7c6f000 rw-p 001bc000 fd:00 2099586 /usr/lib64/libc-2.17.so 
7f9be7c6f000-7f9be7c74000 rw-p 00000000 00:00 0 7f9be7c74000-7f9be7c8b000 r-xp 00000000 fd:00 2099612 /usr/lib64/libpthread-2.17.so 7f9be7c8b000-7f9be7e8a000 ---p 00017000 fd:00 2099612 /usr/lib64/libpthread-2.17.so 7f9be7e8a000-7f9be7e8b000 r--p 00016000 fd:00 2099612 /usr/lib64/libpthread-2.17.so 7f9be7e8b000-7f9be7e8c000 rw-p 00017000 fd:00 2099612 /usr/lib64/libpthread-2.17.so 7f9be7e8c000-7f9be7e90000 rw-p 00000000 00:00 0 7f9be7e90000-7f9be7f91000 r-xp 00000000 fd:00 2099594 /usr/lib64/libm-2.17.so 7f9be7f91000-7f9be8190000 ---p 00101000 fd:00 2099594 /usr/lib64/libm-2.17.so 7f9be8190000-7f9be8191000 r--p 00100000 fd:00 2099594 /usr/lib64/libm-2.17.so 7f9be8191000-7f9be8192000 rw-p 00101000 fd:00 2099594 /usr/lib64/libm-2.17.so 7f9be8192000-7f9be8199000 r-xp 00000000 fd:00 2099616 /usr/lib64/librt-2.17.so 7f9be8199000-7f9be8398000 ---p 00007000 fd:00 2099616 /usr/lib64/librt-2.17.so 7f9be8398000-7f9be8399000 r--p 00006000 fd:00 2099616 /usr/lib64/librt-2.17.so 7f9be8399000-7f9be839a000 rw-p 00007000 fd:00 2099616 /usr/lib64/librt-2.17.so 7f9be839a000-7f9be83b0000 r-xp 00000000 fd:00 2099596 /usr/lib64/libnsl-2.17.so 7f9be83b0000-7f9be85af000 ---p 00016000 fd:00 2099596 /usr/lib64/libnsl-2.17.so 7f9be85af000-7f9be85b0000 r--p 00015000 fd:00 2099596 /usr/lib64/libnsl-2.17.so 7f9be85b0000-7f9be85b1000 rw-p 00016000 fd:00 2099596 /usr/lib64/libnsl-2.17.so 7f9be85b1000-7f9be85b3000 rw-p 00000000 00:00 0 7f9be85b3000-7f9be85c2000 r-xp 00000000 fd:00 2099868 /usr/lib64/libbz2.so.1.0.6 7f9be85c2000-7f9be87c1000 ---p 0000f000 fd:00 2099868 /usr/lib64/libbz2.so.1.0.6 7f9be87c1000-7f9be87c2000 r--p 0000e000 fd:00 2099868 /usr/lib64/libbz2.so.1.0.6 7f9be87c2000-7f9be87c3000 rw-p 0000f000 fd:00 2099868 /usr/lib64/libbz2.so.1.0.6 7f9be87c3000-7f9be87d8000 r-xp 00000000 fd:00 2099774 /usr/lib64/libz.so.1.2.7 7f9be87d8000-7f9be89d7000 ---p 00015000 fd:00 2099774 /usr/lib64/libz.so.1.2.7 7f9be89d7000-7f9be89d8000 r--p 00014000 fd:00 2099774 /usr/lib64/libz.so.1.2.7 
7f9be89d8000-7f9be89d9000 rw-p 00015000 fd:00 2099774 /usr/lib64/libz.so.1.2.7 7f9be89d9000-7f9be89fa000 r-xp 00000000 fd:00 2138765 /usr/lib64/ld-2.17.so 7f9be89ff000-7f9be8a00000 ---p 00000000 00:00 0 7f9be8a00000-7f9be8a10000 rw-p 00000000 00:00 0 [stack:2062] 7f9be8a10000-7f9be8aa2000 r--p 00000000 00:ac 6548218 /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA.paa 7f9be8aa2000-7f9be8bc6000 r--p 00000000 00:ac 6548222 /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/Step-05_INPUT/DATABASE/NCBI/6-species_AA/6-species_AA.pin 7f9be8bc6000-7f9be8bce000 rw-p 00000000 00:00 0 7f9be8bd6000-7f9be8bf8000 rw-p 00000000 00:00 0 7f9be8bf8000-7f9be8bfa000 r-xp 00000000 00:00 0 [vdso] 7f9be8bfa000-7f9be8bfb000 r--p 00021000 fd:00 2138765 /usr/lib64/ld-2.17.so 7f9be8bfb000-7f9be8bfc000 rw-p 00022000 fd:00 2138765 /usr/lib64/ld-2.17.so 7f9be8bfc000-7f9be8bfd000 rw-p 00000000 00:00 0 7ffeea479000-7ffeea4a1000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Nothing like this happened with the test data - and I'm not sure what to make of it...
Can you please send me the query, subject, and the species list? I can take a look. I am sorry that OrthoReD is giving you so much trouble. It is not supposed to...
Hi Kai,
Thanks for all your patience in assisting me!
I'm not exactly sure what you mean by query and subject but here is a list of the files in Step-01 and Step-03 folders, which shows the names of query and subject files:
[eedsinger@cluster5 orthored-flatworm-seasquirt-octopus-human-fly-worm]$ ls -1 Step-01_Queries_RAW/
proteome_Mollusca_Octopus_bimaculoides.okay.aa
[eedsinger@cluster5 orthored-flatworm-seasquirt-octopus-human-fly-worm]$ ls -1 Step-03_Subjects_RAW/
proteome_Arthropoda_Drosophila_melanogaster.okay.aa
proteome_Chordata_Botryllus_schlosseri.okay.aa
proteome_Chordata_Homo_sapiens.okay.aa
proteome_Nematoda_Caenorhabditis_elegans.okay.aa
proteome_Platyhelminthes_Schmidtea_mediterranea.okay.aa
and here is the species list that I input to the perl scripts:
Drosophila melanogaster Dmelanogaster Dm
Botryllus schlosseri Bschlosseri Bs
Homo sapiens Hsapiens Hs
Octopus bimaculoides Obimaculoides Ob
Caenorhabditis elegans Celegans Ce
Schmidtea mediterranea Smediterranea Sm
Or, in case it was the actual files you wanted I've added them here:
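01-list-6-species.txt https://github.com/kbattenb/OrthoReD/files/1734114/01-list-6-species.txt
proteome_Arthropoda_Drosophila_melanogaster.okay.aa.zip https://github.com/kbattenb/OrthoReD/files/1734115/proteome_Arthropoda_Drosophila_melanogaster.okay.aa.zip
proteome_Chordata_Botryllus_schlosseri.okay.aa.zip https://github.com/kbattenb/OrthoReD/files/1734116/proteome_Chordata_Botryllus_schlosseri.okay.aa.zip
proteome_Chordata_Homo_sapiens.okay.aa.zip https://github.com/kbattenb/OrthoReD/files/1734117/proteome_Chordata_Homo_sapiens.okay.aa.zip
proteome_Mollusca_Octopus_bimaculoides.okay.aa.zip https://github.com/kbattenb/OrthoReD/files/1734118/proteome_Mollusca_Octopus_bimaculoides.okay.aa.zip
proteome_Nematoda_Caenorhabditis_elegans.okay.aa.zip https://github.com/kbattenb/OrthoReD/files/1734119/proteome_Nematoda_Caenorhabditis_elegans.okay.aa.zip
proteome_Platyhelminthes_Schmidtea_mediterranea.okay.aa.zip https://github.com/kbattenb/OrthoReD/files/1734120/proteome_Platyhelminthes_Schmidtea_mediterranea.okay.aa.zip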
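A species list like the one in this message can be checked for stray separators before it is passed to the scripts. The sketch below assumes the list is tab-separated with exactly three columns per line (full name, short name, two-letter code), which is inferred from the list shown here and is not a documented OrthoReD specification. `check_species_list` is a hypothetical helper.

```python
def check_species_list(text):
    """Report lines that do not have exactly three non-empty tab-separated fields.

    Assumption: each line is "<full name>\t<short name>\t<code>".
    A doubled tab shows up as an empty field.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3 or any(not field.strip() for field in fields):
            problems.append((number, fields))
    return problems


sample = (
    "Homo sapiens\tHsapiens\tHs\n"
    "Octopus bimaculoides\t\tObimaculoides\tOb\n"  # doubled tab on purpose
)

print(check_species_list(sample))
```

Only the second line of the sample is reported, because its doubled tab produces an empty field.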
Hi Eric,
I went through the files you sent me and I got it to work on OrthoReD. Two main things:
I am attaching a file that explains the changes I needed to make for OrthoReD to run.
Hope this works.
Kai Battenberg
辻井快/Kai Battenberg Ph.D. UC Davis Plant Sciences kbattenberg@ucdavis.edu
species list: Removed one occasion of a double tab and replaced it with a single tab.
Hs sequences: Changed the "." to a "_" to indicate that it is the splice variant information. Changed the "_na_na" to a "_na" to limit the number of fields to 4 rather than 5. E.g. ">Hs_ENST00000390378_1_na"
Ce sequences: Removed all ".". E.g. ">Ce_AC33_na_na"
Sm sequences: Removed "_dd_Smedv6" from all headers. Of the three fields with numbers, the last one was treated as splice variant while the initial two were combined as a single field. Changed the "_na_na" to a "_na" to limit the number of fields to 4 rather than 5. E.g. ">Sm_10_1_na"
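The Hs and Ce rules above can be sketched in a few lines. This is only an illustration: the input headers below are hypothetical reconstructions (the original, unedited headers are not shown in this thread), chosen so that the output matches the examples given. The Sm rule is left out because the original Sm header layout is not shown. The function names are not part of OrthoReD.

```python
def fix_hs_header(header):
    """Hs rule: '.' becomes '_' (splice variant); trailing '_na_na' becomes '_na'."""
    header = header.replace(".", "_")
    if header.endswith("_na_na"):
        header = header[: -len("_na")]  # drop one of the two trailing "_na"
    return header


def fix_ce_header(header):
    """Ce rule: remove every '.'."""
    return header.replace(".", "")


def field_count(header):
    """Count underscore-separated fields, ignoring the leading '>'."""
    return len(header.lstrip(">").split("_"))


# Hypothetical "before" headers; the "after" forms match the examples above
hs_before = ">Hs_ENST00000390378.1_na_na"
ce_before = ">Ce_AC3.3_na_na"

hs_after = fix_hs_header(hs_before)
ce_after = fix_ce_header(ce_before)

print(hs_after, field_count(hs_after))
print(ce_after, field_count(ce_after))
```

Checking `field_count` on every header is a cheap way to confirm that no header ends up with five fields.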
Overall:
Change the format of all sequence files from "Western (Mac OS Roman)" to "Unicode (UTF-8)" by re-saving each sequence file in that format.
I have never had this problem before. I would recommend that you copy all sequences to a new text file via some text editor (e.g. TextWrangler) and re-save it.
If you can get the first line of the file by running "head -n 1
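The re-saving step under "Overall" could also be scripted rather than done in a text editor. The sketch below assumes the source files really are Mac OS Roman (Python's `mac_roman` codec). It also normalizes CR and CRLF line endings to LF, which is an extra assumption not stated in this thread. Both helper names are hypothetical.

```python
def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_to_utf8(raw, source_encoding="mac_roman"):
    """Decode bytes from source_encoding, normalize line endings to LF,
    and return the result encoded as UTF-8."""
    text = raw.decode(source_encoding)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


# A tiny record with classic Mac (CR) line endings and one non-ASCII byte
raw = b">Hs_x_1_na\rMKV\x8a\r"

print(is_valid_utf8(raw))
converted = reencode_to_utf8(raw)
print(is_valid_utf8(converted))
```

For a real file, read it with `open(path, "rb").read()`, convert, and write the result to a new file so the original stays untouched.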
Your suggestions did the trick! Thank-you :)
Now the only thing is that it is running really slow compared to OrthoDB on the same data set - I think due to the limitation of blastp to 8 threads. It seems like this is a limit OrthoReD is imposing on Blastp - If so, is there a reason for this limitation - or is it something I can easily change by going into the perl script?
Hi Eric,
I am glad it worked!!!
So, if everything is working properly, there really shouldn’t be a reason BLASTP cannot work with more than 8 threads. (BLAST 2.4.0+ does not have such a limitation) Can you just try with 80 (which is what you originally had)?
If that causes problems to BLAST, you can add a single line in the perl script to adjust for that.
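One way to choose a value for `--threads` is to cap the requested number at the number of CPUs the machine reports. This is a general sketch, not something OrthoReD or BLAST does by itself, and `pick_threads` is a hypothetical helper.

```python
import os


def pick_threads(requested, available=None):
    """Return a thread count between 1 and the number of available CPUs.

    `available` defaults to os.cpu_count(); it falls back to 1 when the
    count cannot be determined.
    """
    if available is None:
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


print(pick_threads(80, available=16))
print(pick_threads(8, available=16))
```

On a shared cluster node the scheduler's allocation may be smaller than what `os.cpu_count()` reports, so the allocated core count is the safer value to pass as `available`.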
Kai Battenberg
辻井快/Kai Battenberg Ph.D. UC Davis Plant Sciences kbattenberg@ucdavis.edu
Hi Kai!
I am getting the following error when I run just the first perl script on my own sequences:
[eedsinger@cluster5 orthored-flatworm-seasquirt-octopus-human-fly-worm]$ bash 01-orthored-6-species RUNNING SCRIPT: /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/OrthoReD_v20170412 step-01-02.pl (Procedure-00)Setting up the environment. Following species are included in the species list: Botryllus schlosseri Caenorhabditis elegans Drosophila melanogaster Homo sapiens Octopus bimaculoides Schmidtea mediterranea Following files are included in the query: ./proteome_Mollusca_Octopus_bimaculoides.okay.aa (Procedure-01)Generate one quality-checked query file in single-line FASTA format. Sequences with undetermined reading frame are removed. ERROR: temp3.fas is not in single-line FASTA format. Check line 1.
The query input file is a single line fasta file (proteome_Mollusca_Octopus_bimaculoides.okay.aa) that I previously structured per your instructions and looks like:
Ob_Ocbimv22002748m_na_na MEPLELEETVIATLLADSEIKSVMTNVIQELPVIAQDTRMKAVGDEYITVSVIRVSVIGVSVIGVSVIRAPVIRVTVTRISFIRVSFIRVSVIGVSVIGVSVIGVSVIRISVIRVSFIRVSVIRVLFIRVSVISLSVWPKMLAPRDYPEYLYPFFPFLFLPFSQECFQKRLQMF Ob_Ocbimv22002861m_na_na MPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSL Ob_Ocbimv22003975m_na_na MDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHL
temp1.fas looks ok in its structure:
Ob_Ocbimv22002748m_na_na MEPLELEETVIATLLADSEIKSVMTNVIQELPVIAQDTRMKAVGDEYITVSVIRVSVIGVSVIGVSVIRAPVIRVTVTRISFIRVSFIRVSVIGVSVIGVSVIGVSVIRISVIRVSFIRVSVIRVLFIRVSVISLSVWPKMLAPRDYPEYLYPFFPFLFLPFSQECFQKRLQMF Ob_Ocbimv22002861m_na_na MPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSL Ob_Ocbimv22003975m_na_na MDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHL
However, temp2.fas and temp3.fas both seem to lack newline characters. temp3.fas, where the headers are also uppercased and run directly into the sequences, looks like this:
OB_OCBIMV22002748M_NA_NAMEPLELEETVIATLLADSEIKSVMTNVIQELPVIAQDTRMKAVGDEYITVSVIRVSVIGVSVIGVSVIRAPVIRVTVTRISFIRVSFIRVSVIGVSVIGVSVIGVSVIRISVIRVSFIRVSVIRVLFIRVSVISLSVWPKMLAPRDYPEYLYPFFPFLFLPFSQECFQKRLQMFOB_OCBIMV22002861M_NA_NAMPCSLVIKKEEATRITTVGGDYTAFRPWIDNDSKSLQGTVYVPGKYPIQNMYQNSLNTNQNGLPFVTPPPPVSQDTSRSLFTDIYLTKKNVYGDQVVAPPLPFTLDYQFWYQNPLNKPNSYPENLNFLECNGSIVTSSTPSLPSSSTSSPMLTVNPTSSISSVSSSASSSSTLASVSSISSSYPNSLSTSSTLASSSSSSSSLSSSSSSSSSSSSSFSSNLLSSSQGSSAIPELRSVPDGGSEHLSDNGYNNYGNSNNNIIINNNNNNNNNSISNINSIHVDRQAIHYDILNNNPNNNINNNNNNINNNNNKSCSNNNNSNNNNSNSNNNNTNNSCHLHSSMAAPDSTSSTFECINCNKLFGTPHGLEVHVRRSHTGSRPYACDVCQKTFGHAVSLSHHRSVHTQERTFECQQCGKSFKRSSTLSTHLLIHSDTRPYPCPYCGKRFHQKSDMKKHTYIHTGEKPHRCLQCGKAFSQSSNLITHSRKHTGFKPFACDKCGRAFQRKVDLRRHTETQHLNSSLTKQASLLRVVSVQEHVTSLOB_OCBIMV22003975M_NA_NAMDKLCSVQSKLNCIFEIAVTNEPSSKHSNQLYMVSATDKLRPDAKGHNLETALLTEKVDGTCAYVAEFKDRPWLWARHDRKPKKSAEKEFRKFQNEQLDKDATFQWNFEQDFKPFPEHWIPATGVEVKDGVVYPDQNGHTPGWVPIDVNSKQYCWHLESVNLKQGTALLLKETENTALKICLVPLKDILNHTAELIGTSVNGNPYGLGSKKFPFHILIVHGSIKVSYTSEMKRENFLSWMKSDPNGAVEGIVWHCDDGALFKVSHLOB_OCBIMV22007001M_NA_NAMVIIVMMMVKLVMSDNDDDDGRGREGGRGEEGQRGEKEEVEKRKEEEEERKRIRKMRKRKRRQRKGRKKKRRRKRKIKKKRRKRKIKRKRRKRKRRKQOB_OCBIMV22007914M_NA_NAMRDEMANLYKKTHPWSYIPWNIIDLPFLYKEPPQKTLPDFSNDEIYFDVGLGKRGLPFSLDIAKRREKPLLTKGLLIKKLELLAQRAHHLELPEDNDKRRQSLAHSOB_OCBIMV22008886M_NA_NAMTSLRLFVVLTVVPSIIFVLSTTLVDSASHENRRLSILVFGGNGFIGSATVSRLLKTDHSITIINRGNWYWDSNVLIKPYVRHLKCDRMQSLYNCQDLVDFFKTSSSIYFDAIIDFSAYHPFAVREVLAIFRSKVGLYVLISTDSVYDVCMKNHTAPSKETDAVRPFAETMREDYAKNDNYGHLKLQCEEELQQQPEADKIPFLIYRLPDVIGPKDNTYRWWLYQLWMKIRTYLERPVSLPANLVKQEMSLVYVEDVADIIVQYLTGSEDINNEAYNLAIDETPTLYEVLSDIKDSLNLTDLNIFIEPLTSSSIYLFPSVKLGPVDVSKAKEKLNWKPTSWEKILKEIIAFYEKAIKEEKYEIPRRDVIHMMQKHLTRRPLQVLTGLRAVYGIDYPFVKEELOB_OCBIMV22008960M_NA_NAMNGVIKLRGRQRKERERERERERERERERERERERERERERERAKGGEWTKRRKKKQTRLREIKNERRRKERKKWRYFMSESESRGTRILEOB_OCBIMV22009214M_NA_NAMATDPLNMAGRNGIVVPSTSTRQAIAKSFQLEYCWFCGRPMDFFSLSSDNDHMEKLSELERLLAQAQNEKMHLIDEQVKQRESEMVALQEERLKREELERKLQEEALLREQLVQQQVQLREKQIQQARPLT
RYLPIRNKDFDLRQHIEAAGHNLDSCPLVIVTMTSCRGYLQKMGSKFKTWHKRWFFFDRMKRSLLYYSDKNETKARGGIYFQAIEEVYVDHLRTVKSPNPKLTFCVKTYDRTYYLVGPSAEAMRIWIDVIFTGAEGYHTFOB_OCBIMV22009911M_NA_NAKKKVKTLHLILIIKSIVTVTHITTCRGSSCCTYNNDSNYCCDSNNAPNDTNNNRYFDKLSCTIVSFLCACCITSSGQCVCFIGVYNSSNACWAKAEKGDENCRKHLIFIVLLWLAVNNNSLLLLLWLLLDWISOB_OCBIMV22010443M_NA_NAMRRIHAITRSASKTKKPKYHDFLKNCLAEKFTKDIRRMFRNLENSETMARKGVMKRETOB_OCBIMV22011209M_NA_NAMNNFNYQSGKRNSQNAGVRRRGRKQLHEIQNNFNDDIQILEVSLPKRGPVKERPIEKIDLGVKENYSPPTKAIDQCSSEVPQVNYQRDDKLEISFLEGDRAEKSLDLGWESQEIKRNILLKLATDDDYSFTKTEECFIHERAKLYVVNMFPEKHYYYIPDNAEEQPQSIDSDEEILNEGQVFFPTSYSGKRKRRSYNNFNCYVTHHKNDDLYDNSHILPKGTRLLKPPNKLVELVAQAIDGSPDGLLQVHQIYTVLQNKYPYFRFMDRMAINSWRSSIRHALYQKWFRKIHFSTESINRKGCYWTINRQFSPKTWTMPGFQNISSVYFTTDSTENCTSNFQPETQLPEVEAPGPNNSITYEQCNSSNEFQPVINMPQEYSIAIRDDQQNEDEITTLIKDPDPILIPWSSESVTSVGHIENAALLSKDSQMSQLPISFQMNTVIDCEEMLNASPDTWLQQCSGSSDWPQNVDDASLCYPYTQQILLATFPLATSLANTOB_OCBIMV22011909M_NA_NAMVDSQCTVKELDQWIKQLYECNRLSETQVKTLCEKAKEILSQESNVQQVKCPVTVCGDIHGQFHDLMELFSIGGRIPDTNYLFMGDYVDRGYYSVETITLLVVLKVRYKDRITILRGNHESRQITQVYGFYDECLRKYGNANVWKFFTDLFDFLPLTALVDEEIFCLHGGLSPSICNLDNINTLFRKQEIPHEGPMCDLLWSDPGTHNGWSLSPRGLGYTFGKCISELFCYSNGLSLISRAHQLAMESCHFYFQFGVLGDIFNLFTHNIHVIWYGYTNDLTASVSLIQYDNIRLMVFGYMIRLKIIAPEYRYFSIFLOB_OCBIMV22012235M_NA_NAMVRRIHLISDELTEQLNTVSKKFAWFSLALDKSTDNQDTVQLRIFVRGIDENFVITEKLLVLESMKNTTTGQDLFECAVDCVEKSAVSWNRMASITTDGARVFTGKNVGMIKLLENKLKAEHPDGDILPFHYILRQKSFCKPALDLKHVVNPVMSMVNTIRIRAFYHLQFKSHLEDMEAQYGDVIYHNSVRWHVRSFKTKLGLFARKQKVPASVSSKIRDHWLSLEDEVTRRFQDFKKIEPDLNLLSYPLTADIDTAPEEVQLELIDMHFSCMKINEOB_OCBIMV22014287M_NA_NATPPPPVAVYNTEDGEEKNETVQSPIVIAVTATAVTVTNAAAATLIAAAVLVVVTVTPSTKETTTAIATTTTTMIAKVTTATTTKPIPTTTITTVAAAATTTTTRTLPTTKTLPQTQTTWIQEQTRTQRSKTADTTVTAHTDTTOB_OCBIMV22014735M_NA_NAMKYRITSRNCTYIDTHTESVETINKYTEEHTHQHTHTAFVERAINETMYKHTHTHNWIWKHSHTYSSMYTHTHTHTQPNTLTHIQVCTTSHQEVLTFIOB_OCBIMV22016260M_NA_NA
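For reference, here is a rough sketch of how I understand "single-line FASTA": each record is one header line starting with `>` followed by exactly one sequence line. This is only my assumption about the format, not OrthoReD's actual check, and the function name `check_single_line_fasta` is made up for illustration.

```python
def check_single_line_fasta(lines):
    """Return (ok, message) for an iterable of text lines.

    Assumption (not taken from OrthoReD): a valid record is one header
    line starting with '>' followed by exactly one sequence line.
    """
    expect_header = True
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # blank lines are ignored in this sketch
        if expect_header:
            if not line.startswith(">"):
                return False, f"line {number}: expected a header starting with '>'"
            expect_header = False
        else:
            if line.startswith(">"):
                return False, f"line {number}: header found where a sequence was expected"
            expect_header = True
    if not expect_header:
        return False, "file ends with a header that has no sequence line"
    return True, "ok"
```

It could be run on one of the intermediate files with something like `print(check_single_line_fasta(open("temp3.fas")))`, which reports the first line that breaks the assumed layout.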
Folders and files produced include:
01_QUERY_QCD (empty)
step-01-02_LOG.txt
Step-01_Queries_RAW (contains a copy of the query input fasta file)
Step-02_QUERY (empty)
temp1.fas
temp2.fas
temp3.fas
The log text file that was produced reads:
RUNNING SCRIPT: /automounts/workspace/workspace/eedsinger/088_invertebrate-immune-systems/orthored/orthored-flatworm-seasquirt-octopus-human-fly-worm/OrthoReD_v20170412 step-01-02.pl
(Procedure-00)Setting up the environment.
Following species were included in the species list:
Botryllus schlosseri
Caenorhabditis elegans
Drosophila melanogaster
Homo sapiens
Octopus bimaculoides
Schmidtea mediterranea
Following files were included in the query:
./proteome_Mollusca_Octopus_bimaculoides.okay.aa
(Procedure-01)Generate one quality-checked query file in single-line FASTA format.
Sequences with undetermined reading frame were removed.
step-01-02_LOG.txt (END)
The command line I ran was:
perl ./OrthoReD_v20170412/step-01-02.pl --query ./proteome_Mollusca_Octopus_bimaculoides.okay.aa --q_seq_type AA --spp_list ./01-list-6-species
Any ideas or suggestions on what to correct would be greatly appreciated!