Open spoonbender76 opened 2 months ago
Let me try with a separate user; I am normally a sudo user on this system, so I didn't notice this.
You may be able to use --writable-tmpfs instead of --overlay. I will check in the morning to be sure.
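As a rough illustration of that suggestion, the sketch below prints the runtime part of the documented command with --overlay replaced by --writable-tmpfs. It is a dry run that only echoes the command. The function name flag_singularity_prefix is hypothetical, and whether FLAG actually runs correctly with this option has not been confirmed in this thread.

```shell
# Sketch only: swap --overlay for --writable-tmpfs so no root-owned overlay is needed.
# Bind paths and image name mirror the documented command; the function name is made up.

# Print the runtime options and image; the FLAG options (-g, -r, -p, ...)
# would follow the image name unchanged.
flag_singularity_prefix() {
    # $1 = host working directory, $2 = path to the FLAG image
    printf 'singularity run --bind %s:/data --bind %s/tempdir:/tmp --writable-tmpfs %s\n' \
        "$1" "$1" "$2"
}

# Dry run: show the command instead of executing it.
flag_singularity_prefix "$(pwd)" singularity_flag.image
```

Printing the command first makes it easy to compare against the documented one before running anything as a non-root user.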
On Thu, Apr 11, 2024, 9:21 PM spoonbender76 @.***> wrote:
I am trying the new big Singularity image, and from the documentation I have questions about some commands:
- --overlay $(pwd)/tempdir will trigger
FATAL:container creation failed:while setting overlay session layout: only root user can use sandbox overlay in setuid mode
so Singularity users still require root.
- Even with -w miniprot, it shows
chosen protein algos: None
- I don't know whether this error can be ignored or whether it has any effect:
WARN: Unknown directive runOptions for process pasa
[b7/9cbcde] NOTE: Process pasa(1) terminated with an error exit status(2) -- Error is ignored
image.png (view on web) https://github.com/formbio/FLAG/assets/109210499/0062f413-7ee7-443e-996b-4565662c3cde
If Liftoff is desired the above command can be modified such as below:
singularity run --bind $(pwd):/data --bind $(pwd)/tempdir:/tmp \
    --overlay $(pwd)/tempdir singularity_flag.image \
    -g Erynnis_tages-GCA_905147235.1-softmasked.fa -r curatedButterflyRNA.fa \
    -p curatedButterflyProteins.fa -f GCF_009731565.1_Dplex_v4_genomic.fa \
    -a GCF_009731565.1_Dplex_v4_genomic.gff -m skip -t true \
    -l lepidoptera_odb10 \
    -z Helixer,helixer_trained_augustus -q vertebrate -s small -n Eynnis_tages \
    -w miniprot -y normal -p singularity -o outputdir -u singularity
My guess is that Pasa is failing because it is trying to write a SQLite file, which is related to the overlay problem. The warning you see for it is fine.
Even if Pasa fails, you should still get good results, since Splign also does transcript-to-genome alignments. The results are just slightly better when Pasa works.
The protein algo issue has been fixed, as have the instructions for Singularity with Liftoff. Those were typos; thank you for catching them!
I set up Docker according to the instructions and have been running the Docker example for over 15 hours, but I don't see any sign of progress. No splign/augustus processes are visible. I had to press CTRL-C since it appears to be stalled.
nextflow run main.nf -w workdir/ --output outputdir/ \
--genome examples/Erynnis_tages-GCA_905147235.1-softmasked.fa --rna examples/curatedButterflyRNA.fa \
--proteins examples/curatedButterflyProteins.fa --fafile examples/GCF_009731565.1_Dplex_v4_genomic.fa \
--gtffile examples/GCF_009731565.1_Dplex_v4_genomic.gff --masker skip --transcriptIn true \
--lineage lepidoptera_odb10 --annotationalgo Liftoff,Helixer,helixer_trained_augustus \
--helixerModel invertebrate --externalalgo input_transcript,input_proteins --size small --proteinalgo miniprot \
--speciesScientificName Eynnis_tages \
--funcAnnotProgram eggnog --eggnogDB eggnogDB.tar.gz -profile docker
By the way, the flags --fafile and --gtffile appear to be specified twice in the Example Docker Run commands:
If Liftoff is desired the above command can be modified such as below:
nextflow run main.nf -w workdir/ --output outputdir/ \
    --genome examples/Erynnis_tages-GCA_905147235.1-softmasked.fa --rna examples/curatedButterflyRNA.fa \
    --proteins examples/curatedButterflyProteins.fa --fafile examples/GCF_009731565.1_Dplex_v4_genomic.fa \
    --gtffile examples/GCF_009731565.1_Dplex_v4_genomic.gff --masker skip --transcriptIn true \
    --lineage lepidoptera_odb10 --annotationalgo Liftoff,Helixer,helixer_trained_augustus \
    --helixerModel invertebrate --externalalgo input_transcript,input_proteins --size small --proteinalgo miniprot \
    --speciesScientificName Eynnis_tages --fafile examples/monarchGenome.fa --gtffile examples/monarchAnnotation.gff3 \
    --funcAnnotProgram eggnog --eggnogDB eggnogDB.tar.gz -profile docker

nextflow run main.nf -w workdir/ --output outputdir/ \
    --genome examples/Erynnis_tages-GCA_905147235.1-softmasked.fa --rna examples/curatedButterflyRNA.fa \
    --proteins examples/curatedButterflyProteins.fa --fafile examples/GCF_009731565.1_Dplex_v4_genomic.fa \
    --gtffile examples/GCF_009731565.1_Dplex_v4_genomic.gff --masker skip --transcriptIn true \
    --lineage lepidoptera_odb10 --annotationalgo Liftoff,Helixer,helixer_trained_augustus \
    --helixerModel invertebrate --externalalgo input_transcript,input_proteins --size small \
    --proteinalgo miniprot --speciesScientificName Eynnis_tages --fafile examples/monarchGenome.fa \
    --gtffile examples/monarchAnnotation.gff3 --runMode laptop --funcAnnotProgram eggnog \
    --eggnogDB eggnogDB.tar.gz -profile docker_small
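Repeated long options like these can be spotted mechanically. The following is a minimal sketch using only standard text tools; duplicate_flags is a hypothetical helper name, and the sample command is shortened for illustration rather than copied from the README.

```shell
# List long options (--name) that occur more than once in a command string.
duplicate_flags() {
    # Split on whitespace, keep tokens that look like --option, report repeats.
    printf '%s\n' "$1" | tr -s '[:space:]' '\n' | grep -E '^--[A-Za-z]' | sort | uniq -d
}

# Shortened sample command with the same kind of repetition.
duplicate_flags "nextflow run main.nf --fafile a.fa --gtffile a.gff --masker skip --fafile b.fa --gtffile b.gff3"
# prints:
# --fafile
# --gtffile
```

Single-dash options such as -profile are deliberately ignored here, since only the double-dash pipeline parameters were reported as duplicated.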
This one is interesting. It looks like Splign has stalled. Splign does not usually run into issues, so I am unsure why. If you have the process logs for that task, feel free to send them.
Augustus and all of the remaining processes are waiting for Splign to finish, because the Splign outputs feed into them.
And thank you for noticing the duplicated flags; I will fix that ASAP. I am currently working on making the Singularity setup run from one large container.
Thank you for the quick reply! I ran the Docker Liftoff example again, but Nextflow keeps printing lines like the following to .nextflow.log even though no splign processes are running. I also checked the workdir and found that splign.gff3 is empty.
Apr-17 15:22:17.159 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor Local > tasks to be completed: 1 -- submitted tasks are shown below
~> TaskHandler[id: 7; name: splign (1); status: RUNNING; exit: -; error: -; workDir: /home/cnrri01/ssd/FLAG/workdir/b6/4e8fca8b11ebc069bc1491ba514ae6]
tree -s -D -h /home/cnrri01/ssd/FLAG/workdir/b6/4e8fca8b11ebc069bc1491ba514ae6/
[4.0K Apr 17 11:15] /home/cnrri01/ssd/FLAG/workdir/b6/4e8fca8b11ebc069bc1491ba514ae6/
|-- [4.0K Apr 17 11:21] 1_folder
| |-- [4.0K Apr 17 11:15] _SplignLDS2_
| | `-- [ 17M Apr 17 11:15] splign.lds2db
| |-- [452K Apr 17 11:19] cdna.compartments
| |-- [127M Apr 17 11:13] cdna.fa
| |-- [2.7M Apr 17 11:16] cdna.fa.ndb
| |-- [7.7M Apr 17 11:16] cdna.fa.nhr
| |-- [586K Apr 17 11:16] cdna.fa.nin
| |-- [ 497 Apr 17 11:16] cdna.fa.njs
| |-- [195K Apr 17 11:16] cdna.fa.nog
| |-- [1.1M Apr 17 11:16] cdna.fa.nos
| |-- [586K Apr 17 11:16] cdna.fa.not
| |-- [ 30M Apr 17 11:16] cdna.fa.nsq
| |-- [ 16K Apr 17 11:16] cdna.fa.ntf
| |-- [195K Apr 17 11:16] cdna.fa.nto
| |-- [319M Apr 17 11:15] genome.fa
| |-- [ 32K Apr 17 11:15] genome.fa.ndb
| |-- [4.4K Apr 17 11:15] genome.fa.nhr
| |-- [ 580 Apr 17 11:15] genome.fa.nin
| |-- [ 516 Apr 17 11:15] genome.fa.njs
| |-- [ 192 Apr 17 11:15] genome.fa.nog
| |-- [ 573 Apr 17 11:15] genome.fa.nos
| |-- [ 488 Apr 17 11:15] genome.fa.not
| |-- [ 79M Apr 17 11:15] genome.fa.nsq
| |-- [ 16K Apr 17 11:15] genome.fa.ntf
| |-- [ 164 Apr 17 11:15] genome.fa.nto
| |-- [ 10M Apr 17 11:21] splign.asn
| |-- [ 0 Apr 17 11:21] splign.gff3
| |-- [ 78K Apr 17 11:21] splign.log
| `-- [965K Apr 17 11:21] splign.out
|-- [4.0K Apr 17 11:20] 2_folder
| |-- [4.0K Apr 17 11:15] _SplignLDS2_
| | `-- [ 14M Apr 17 11:15] splign.lds2db
| |-- [786K Apr 17 11:19] cdna.compartments
| |-- [108M Apr 17 11:13] cdna.fa
| |-- [2.1M Apr 17 11:16] cdna.fa.ndb
| |-- [6.0M Apr 17 11:16] cdna.fa.nhr
| |-- [467K Apr 17 11:16] cdna.fa.nin
| |-- [ 497 Apr 17 11:16] cdna.fa.njs
| |-- [156K Apr 17 11:16] cdna.fa.nog
| |-- [895K Apr 17 11:16] cdna.fa.nos
| |-- [467K Apr 17 11:16] cdna.fa.not
| |-- [ 25M Apr 17 11:16] cdna.fa.nsq
| |-- [ 16K Apr 17 11:16] cdna.fa.ntf
| |-- [156K Apr 17 11:16] cdna.fa.nto
| |-- [319M Apr 17 11:15] genome.fa
| |-- [ 32K Apr 17 11:15] genome.fa.ndb
| |-- [4.4K Apr 17 11:15] genome.fa.nhr
| |-- [ 580 Apr 17 11:15] genome.fa.nin
| |-- [ 516 Apr 17 11:15] genome.fa.njs
| |-- [ 192 Apr 17 11:15] genome.fa.nog
| |-- [ 573 Apr 17 11:15] genome.fa.nos
| |-- [ 488 Apr 17 11:15] genome.fa.not
| |-- [ 79M Apr 17 11:15] genome.fa.nsq
| |-- [ 16K Apr 17 11:15] genome.fa.ntf
| |-- [ 164 Apr 17 11:15] genome.fa.nto
| |-- [ 13M Apr 17 11:20] splign.asn
| |-- [ 0 Apr 17 11:20] splign.gff3
| |-- [239K Apr 17 11:20] splign.log
| `-- [1.2M Apr 17 11:20] splign.out
|-- [319M Apr 17 11:13] Erynnis_tages-GCA_905147235.1-softmasked.fa
|-- [234M Apr 17 11:13] cdna.fa
|-- [234M Apr 17 11:13] formatted_curatedButterflyRNA.fa
|-- [319M Apr 17 11:13] genome.fa
|-- [ 32K Apr 17 11:13] genome.fa.ndb
|-- [4.4K Apr 17 11:13] genome.fa.nhr
|-- [ 580 Apr 17 11:13] genome.fa.nin
|-- [ 516 Apr 17 11:13] genome.fa.njs
|-- [ 192 Apr 17 11:13] genome.fa.nog
|-- [ 573 Apr 17 11:13] genome.fa.nos
|-- [ 488 Apr 17 11:13] genome.fa.not
|-- [ 79M Apr 17 11:13] genome.fa.nsq
|-- [ 16K Apr 17 11:13] genome.fa.ntf
|-- [ 164 Apr 17 11:13] genome.fa.nto
|-- [ 283 Apr 17 11:15] parallel_001.txt
|-- [ 283 Apr 17 11:15] parallel_002.txt
`-- [ 44 Apr 17 11:15] parallel_commands.txt
5 directories, 73 files
I've attached some log files here for reference. Please let me know if you need any other information.
Attachments: nextflow.log, command.log, command.out.txt, command.err.txt, splign.log, splign.out.txt, parallel_001.txt, parallel_002.txt, parallel_commands.txt
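For anyone checking the same symptom, zero-byte Splign outputs like the ones in the listing above can be found with a short search of the Nextflow work directory. This is a sketch under the assumption that the output file is always named splign.gff3, as it is in the listing; empty_splign_outputs is a hypothetical helper name.

```shell
# List Splign GFF3 outputs under a Nextflow work directory that are still empty.
empty_splign_outputs() {
    # -size 0 matches files that occupy zero blocks, i.e. empty files.
    find "$1" -type f -name 'splign.gff3' -size 0 | sort
}

# Example, using the task directory from the listing above:
# empty_splign_outputs /home/cnrri01/ssd/FLAG/workdir/b6/4e8fca8b11ebc069bc1491ba514ae6
```

An empty file alone does not prove a stall, since Splign may simply not have written results yet, so it is worth comparing the file timestamps against the last lines of splign.log.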
I updated the ncbiclibraries container that Splign runs on, which will hopefully fix the problem you are having.
I tested it on a completely fresh Debian system with only Nextflow and Docker newly installed, and it ran fine.
So I'd try re-pulling the containers, specifically ghcr.io/formbio/flag_ncbiclibraries:latest, and rerunning; fingers crossed it works. Docker should be much more stable than Singularity.
I haven't used FLAG in a while, but have you tried running it with the --annotationalgo Liftoff,Helixer,helixer_trained_augustus flag?
I noticed that all the screenshots of successful runs seem to omit Liftoff from the --annotationalgo flag.
Yes, I have. I will do a run tomorrow and add a screenshot to the docs.
A screenshot of it working has been added to the readme.md file on the GitHub main branch. It should also serve as a reference for other users.
Thank you for the help! I have completed a FLAG run using Apptainer, but I encountered significant delays while downloading BUSCO lineage files, likely caused by my connection issues. Is there a way to specify a local directory of pre-downloaded BUSCO lineage files and use the --offline option for all BUSCO commands within the FLAG pipeline?
Details: I tried to create a FLAG environment with Apptainer using the following commands:
conda create -n flag apptainer
conda activate flag
cp /etc/apptainer/apptainer.config $CONDA_PREFIX/etc/apptainer/
However, the folder /etc/apptainer/ didn't exist. So I installed Apptainer v1.31 manually, re-pulled all containers, and reran the pipeline. Unfortunately, it got stuck at the CombineAndFilter step. On closer inspection, I found that this was primarily due to my very slow connection, which took hours to download the BUSCO file lepidoptera_odb10.tar.gz. I experienced the same issue with Docker. So I wonder whether it's possible to use the --offline option for all BUSCO commands within the FLAG pipeline.
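To make the request concrete, here is one possible shape of such an offline setup, using BUSCO directly rather than FLAG. The options --download, --download_path and --offline are standard BUSCO v5 options; how they would be wired into the FLAG pipeline is an open question, and the helper name busco_offline_cmd, the file names and the paths are placeholders.

```shell
# Sketch only: plain BUSCO usage, not something FLAG exposes in this thread.
# Step 1 (once, on a machine with a good connection): fetch the lineage locally.
#   busco --download lepidoptera_odb10 --download_path /path/to/busco_downloads

# Step 2: build a BUSCO command that reads the local copy and never goes online.
busco_offline_cmd() {
    # $1 = input FASTA, $2 = BUSCO mode, $3 = lineage, $4 = output name, $5 = download dir
    printf 'busco -i %s -m %s -l %s -o %s --download_path %s --offline\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Dry run: print the command instead of executing it.
busco_offline_cmd proteins.fa proteins lepidoptera_odb10 busco_out /path/to/busco_downloads
```

Inside a container the download directory would also need to be bind-mounted, which is another detail the pipeline would have to handle.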
Thanks for the update! I'm glad it ran for you!
As for the offline mode, that's actually a pretty smart idea. I always have a fast connection, so I never experienced this issue, but an offline mode should be useful to others. I will put this on my to-do list!
I am trying the new big Singularity image (81.5 G), and from the documentation I have questions about some commands.
Singularity users still require root. Is this necessary?
I also don't know whether this error can be ignored or whether it has any effect.
-z if Liftoff is desired? The documentation says: "If Liftoff is desired the above command can be modified such as below:". In chosen annotation algo, Liftoff is absent.