Open CongLiu37 opened 1 year ago
Aligning separately within GALBA is already possible. If you provide several protein file names, comma-separated, the files will be aligned one by one. It’s not elegant: the index is rebuilt every time. Other parts of the pipeline may be more RAM-critical.
I will not add feeding precomputed alignments into GALBA as a command line option.
Keep in mind that GALBA today is a pipeline for exactly one reference protein set. We will continue to expand functionality for large protein input, but currently, using one specified protein set is your safest bet.
Hello,
I just tested multiple protein files, but it did not work.
galba.pl --genome=genome.fa --prot_seq=proteins1.fa,proteins2.fa --skipOptimize --threads 8
#**********************************************************************************
# GALBA CONFIGURATION
#**********************************************************************************
# GALBA CALL: /home/c/c-liu/Softwares/GALBA/scripts/galba.pl --genome=genome.fa --prot_seq=proteins1.fa,proteins2.fa --skipOptimize --threads 8
# Wed Mar 1 16:53:15 2023: galba.pl version 1.0.1
# Wed Mar 1 16:53:15 2023: Configuring of GALBA for using external tools...
# Wed Mar 1 16:53:15 2023: Found environment variable $AUGUSTUS_CONFIG_PATH. Setting $AUGUSTUS_CONFIG_PATH to /home/c/c-liu/Softwares/Augustus/config/
# Wed Mar 1 16:53:15 2023: Found environment variable $AUGUSTUS_BIN_PATH. Setting $AUGUSTUS_BIN_PATH to /apps/unit/BioinfoUgrp/DebianMed/11.2/modules/augustus/3.4.0+dfsg2-2/bin/
# Wed Mar 1 16:53:15 2023: Found environment variable $AUGUSTUS_SCRIPTS_PATH. Setting $AUGUSTUS_SCRIPTS_PATH to /home/c/c-liu/Softwares/Augustus/scripts/
# Wed Mar 1 16:53:15 2023: Found environment variable $PYTHON3_PATH. Setting $PYTHON3_PATH to /home/c/c-liu/miniconda3/bin/
# Wed Mar 1 16:53:15 2023: Found environment variable $DIAMOND_PATH. Setting $DIAMOND_PATH to /apps/unit/BioinfoUgrp/Other/DIAMOND/2.0.4.142/
# Wed Mar 1 16:53:15 2023: Found environment variable $MINIPROT_PATH. Setting $GMINIPROT_PATH to /home/c/c-liu/Softwares/miniprot/
# Wed Mar 1 16:53:15 2023: ERROR: in file /home/c/c-liu/Softwares/GALBA/scripts/galba.pl at line 541
GALBA does currently not support using multiple protein input files with Miniprot as an aligner. Please combine your protein fasta files into a single file before starting GALBA.
Sincerely,
Cong
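The error message above asks for a single combined protein FASTA file. A minimal sketch of that step follows, assuming plain uncompressed FASTA input; the helper name `combine_fasta` and the output name `proteins_combined.fa` are hypothetical and not part of GALBA. It only guards against one common pitfall, a file that lacks a trailing newline, and does not address the memory concern itself.

```shell
# Hypothetical helper (not part of GALBA): concatenate FASTA files into one,
# making sure each input ends with a newline so that the next file's header
# is never glued onto the previous file's last sequence line.
combine_fasta() {
    out=$1
    shift
    : > "$out"                      # create or truncate the output file
    for f in "$@"; do
        cat "$f" >> "$out"
        # tail -c 1 yields an empty string when the last byte is a newline
        if [ -n "$(tail -c 1 "$f")" ]; then
            printf '\n' >> "$out"
        fi
    done
}

# Example usage with the file names from this thread:
# combine_fasta proteins_combined.fa proteins1.fa proteins2.fa
# galba.pl --genome=genome.fa --prot_seq=proteins_combined.fa --skipOptimize --threads 8
```

Duplicate sequence IDs across the input files are not detected by this sketch; checking for them beforehand may be worthwhile.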
Ah, then it was disabled because it would rebuild the index. I am writing from my phone. It’s not hard to reverse this change, but I am not convinced the alignment step is the memory-critical step. I will measure RAM consumption this week or next.
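For anyone who wants to check which step is memory-critical on their own data, one possible way to measure peak RAM is sketched below. It assumes GNU time is installed at `/usr/bin/time` (the `-v` and `-o` options are specific to GNU time); the helper name `peak_rss_kb` is hypothetical, and this is not necessarily how the measurement will be done by the maintainers.

```shell
# Hypothetical helper: run a command under GNU time and print its peak
# resident set size in kilobytes. Assumes GNU time at /usr/bin/time.
peak_rss_kb() {
    log=$(mktemp)
    # -v produces a detailed report, -o writes that report to a file.
    # The command's own stdout is discarded; this sketch only measures memory.
    /usr/bin/time -v -o "$log" "$@" > /dev/null
    # The report contains a line "Maximum resident set size (kbytes): N".
    awk -F': ' '/Maximum resident set size/ { print $2 }' "$log"
    rm -f "$log"
}

# Example (assumes miniprot is on PATH and the input files exist):
# peak_rss_kb miniprot -t 8 --gff genome.fa proteins1.fa
```

Running the same helper on the other pipeline steps would show whether alignment or a later stage dominates memory use.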
Thank you for your feedback!
Sincerely,
Cong
Hello,
I am wondering if it is possible to run GALBA with pre-computed miniprot alignments, or make GALBA accept multiple protein files and call miniprot for these files one by one? I have limited memory and it is difficult to run miniprot with all proteins in a single file.
Sincerely,
Cong