We are trying to run Topiary on sequences that seem to have no paralogs in most clades but that may be named differently depending on the species. Our database, which has Opisthokonts as its scope, contains one sequence from yeast and one from humans. Even though they are named differently (RQC1_YEAST and TCF25_HUMAN), we believe they are orthologs, because virtually no model species has more than one sequence containing the same domain (PF04910 on Pfam).
How should we prepare the input seed in this case? We tried two approaches: giving each of the two sequences its own aliases, and putting all aliases from both sequences on both entries. In both cases the reciprocal BLAST step produces 4675 sequences in 02_recip-blast-dataframe.csv, but the shrunk dataframe is reduced to just one sequence, and seed-to-alignment then stops at the "Aligning sequences" step with the following error:
muscle 5.1.linux64 [] 7.6Gb RAM, 4 cores
Built May 16 2023 07:53:40
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com
Input: 1 seqs, avg length 676, max 676
double free or corruption (out)
Traceback (most recent call last):
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 478, in seed_to_alignment
df = topiary.muscle.align(df)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/muscle/muscle.py", line 96, in align
_run_muscle(input_fasta,output_fasta,super5,silent,muscle_cmd_args,muscle_binary)
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/muscle/muscle.py", line 216, in _run_muscle
raise subprocess.CalledProcessError(return_code, cmd)
subprocess.CalledProcessError: Command '['muscle', '-align', 'topiary-tmp_dULdoeuPiV_align-in.fasta', '-output', 'topiary-tmp_dULdoeuPiV_align-out.fasta']' died with <Signals.SIGABRT: 6>.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.dict)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'seed_to_alignment'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/amandacpa/miniconda3/envs/topiary/bin/topiary-seed-to-alignment", line 26, in <module>
main()
File "/home/amandacpa/miniconda3/envs/topiary/bin/topiary-seed-to-alignment", line 21, in main
wrap_function(seed_to_alignment,
File "/home/amandacpa/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/wrap.py", line 189, in wrap_function
raise RuntimeError(err) from e
RuntimeError:
Function seed_to_alignment raised an error.
==================
This is the latest seed file we used, which caused the error above:
species,name,aliases,sequence,accession
Homo sapiens,RQC1,TCF25;TCF-25;Nuclear localized protein 1;KIAA1049;NULP1;FKSG26;RQC1;YDR333C,MSRRALRRLRGEQRGQEPLGPGALHFDLRDDDDAEEEGPKRELGVRRPGGAGKEGVRVNNRFELINIDDLEDDP
VVNGERSGCALTDAVAPGNKGRGQRGNTESKTDGDDTETVPSEQSHASGKLRKKKKKQKNKKSSTGEASENGLEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVEHRHLNPDTELKRYFGARAILGEQRPRQRQRVYPKCTWLTTPKSTWPRYSKPGLSMRLLESK
KGLSFFAFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQTSPYHVDSLLQLSDACRFQEDQEMARDLVERALYSMECAFHPLFSLTSGACRLDYRRPENRSFYLALYKQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCMLLLIDHLALRARNYEYLIRLFQEWEAHR
NLSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQKASLLIQQALTMFPGVLLPLLESCSVRPDASVSSHRFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPATMSWLEENVHEVLQAVDAGDPAVEACENRRKVLYQRAPRNIHRHVILSEIKEAVAALPPDVTTQ
SVMGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRSLLPNYTMEGERPEEGVAGGLNRNQGLNRLMLAVRDMMANFHLNDLEAPHEDDAEGEGEWD,Q9BQ70
Saccharomyces cerevisiae,RQC1,TCF25;TCF-25;Nuclear localized protein 1;KIAA1049;NULP1;FKSG26;RQC1;YDR333C,MSSRALRRLQDDNALLESLLSNSNANKMTSGKSTAGNIQKRENIFSMMNNVRDSDNSTDEGQ
MSEQDEEAAAAGERDTQSNGQPKRITLASKSSRRKKNKKAKRKQKNHTAEAAKDKGSDDDDDDEEFDKIIQQFKKTDILKYGKTKNDDTNEEGFFTASEPEEASSQPWKSFLSLESDPGFTKFPISCLRHSCKFFQNDFKKLDPHTEFKLLFDDISPESLEDIDSMTS
TPVSPQQLKQIQRLKRLIRNWGGKDHRLAPNGPGMHPQHLKFTKIRDDWIPTQRGELSMKLLSSDDLLDWQLWERPLDWKDVIQNDVSQWQKFISFYKFEPLNSDLSKKSMMDFYLSVIVHPDHEALINLISSKFPYHVPGLLQVALIFIRQGDRSNTNGLLQRALFV
FDRALKANIIFDSLNCQLPYIYFFNRQFYLAIFRYIQSLAQRGVIGTASEWTKVLWSLSPLEDPLGCRYFLDHYFLLNNDYQYIIELSNSPLMNCYKQWNTLGFSLAVVLSFLRINEMSSARNALLKAFKHHPLQLSELFKEKLLGDHALTKDLSIDGHSAENLELKA
YMARFPLLWNRNEEVTFLHDEMSSILQDYHRGNVTIDSNDGQDHNNINNLQSPFFIAGIPINLLRFAILSEESSVMAAIPSFIWSDNEVYEFDVLPPMPTSKESIEVVENIKTFINEKDLAVLQAERMQDEDLLNQIRQISLQQYIHENEESNENEG,Q05468
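For comparison, the other variant we tried (each entry carrying only its own aliases) can be produced from the file above with a small script like the one below. This is only a sketch using the Python standard library, not part of Topiary: the function name `split_aliases` and the `ALIASES_BY_SPECIES` mapping are hypothetical, the assignment of each alias to a species is our assumption, and the sequences in the example are truncated placeholders.

```python
import csv
import io

# Assumed assignment of aliases to species (hypothetical mapping, adjust as needed).
ALIASES_BY_SPECIES = {
    "Homo sapiens": [
        "TCF25", "TCF-25", "Nuclear localized protein 1",
        "KIAA1049", "NULP1", "FKSG26",
    ],
    "Saccharomyces cerevisiae": ["RQC1", "YDR333C"],
}


def split_aliases(seed_csv_text):
    """Return seed CSV text in which each row lists only its own species' aliases."""
    reader = csv.DictReader(io.StringIO(seed_csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Replace the shared alias list with the species-specific one.
        row["aliases"] = ";".join(ALIASES_BY_SPECIES[row["species"]])
        writer.writerow(row)
    return out.getvalue()


# Truncated placeholder sequences, for illustration only.
example = (
    "species,name,aliases,sequence,accession\n"
    "Homo sapiens,RQC1,TCF25;RQC1;YDR333C,MSRRALRRLRGEQ,Q9BQ70\n"
    "Saccharomyces cerevisiae,RQC1,TCF25;RQC1;YDR333C,MSSRALRRLQDDN,Q05468\n"
)

print(split_aliases(example))
```

Both variants end with the same single-sequence shrunk dataframe and the same MUSCLE crash.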