smarco / gem3-mapper

GEM-Mapper v3
GNU General Public License v3.0

index - Successfully built but ends with error #24

Open descostesn opened 3 years ago

descostesn commented 3 years ago

Hi,

When running the (snakemake) command gem-indexer --input {input} --output data/genome/fasta/mm10/ensembl/gem3/{params.genomeRoot} --threads {threads} --verbose, it finishes, but with an error:

2021/1/21 14:14:24 --  100% ... done [0.804 s]
2021/1/21 14:14:29 -- [GEM Index '/g/boulard/Projects/MULTIREADS/singularities/gem3/mmusculus_v38.gem' was successfully built in 
9.953 min.] (see '/g/boulard/Projects/MULTIREADS/singularities/gem3/mmusculus_v38.info' for further info)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> GEM.System.Error::Signal raised (no=11) [errno=0,Success]
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
srun: error: smer09-3: task 0: Exited with exit code 1

I get the same error with a normal bash command.
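
In case it is useful: signal 11 is a segmentation fault, so a backtrace of the teardown crash could be captured roughly like this (just a sketch, assuming core dumps are permitted on the node and gdb is installed; genome.fa and index_prefix are placeholders):

```shell
# Enable core dumps in this shell; on SIGSEGV (signal 11) the rerun
# below should then leave a core file behind.
ulimit -c unlimited 2>/dev/null || true

# Rerun the indexer exactly as before (commented out here, placeholders):
# gem-indexer --input genome.fa --output ./index_prefix --threads 8 --verbose

# If a core file appears, print the crashing stack (assumes gdb is
# installed and the gem-indexer binary still has its symbols):
# gdb -batch -ex bt "$(command -v gem-indexer)" core*
```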

Do you know what could be the problem?

Thank you for your help.

smarco commented 3 years ago

Hi,

If the message "GEM Index '/g/boulard/Projects/MULTIREADS/singularities/gem3/mmusculus_v38.gem' was successfully built in 9.953 min." is shown, that means the index was built correctly, but the application crashed during clean-up and termination (e.g., while releasing memory).

Nonetheless, it is weird. I can have a look at it. Can you please provide more information about the system you are using (OS, processor, memory, etc.)?

Thanks

descostesn commented 3 years ago

LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.6.1810 (Core) 
Release:    7.6.1810
Codename:   Core

As for memory and so on, this is an HPC cluster. The node was HT, CPU 2.5 GHz, AVX2, 25G network.

Thanks!

descostesn commented 3 years ago

The following steps were done on Ubuntu 18.04.4 LTS, memory 250.6 GiB, processor Intel® Xeon® Gold 5118 CPU @ 2.30GHz × 24, graphics AMD® Radeon Pro WX 2100, GNOME 3.28.2, OS type 64-bit, disk 513.8 GB.

Following these steps, you should in theory be able to reproduce the error.

1) Download the sra file:

wget https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos1/sra-pub-run-5/SRR566890/SRR566890.2

2) Convert it to fastq:

singularity pull shub://descostesn/singularities:parallelfastqdumpv063
mkdir fastqdump-SRR566890.2
singularity exec --bind ./ singularities_parallelfastqdumpv063.sif parallel-fastq-dump -s SRR566890.2 -t 8 -O ./ --tmpdir ./fastqdump-SRR566890.2
mv SRR566890.2.fastq SRR566890.fastq
gzip -c SRR566890.fastq > SRR566890.2.fastq.gz
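
As an optional sanity check on the converted file (check_fastq below is a throwaway helper of my own, not part of any of the tools above), one can verify that the FASTQ has a multiple of four lines and that every record header starts with '@':

```shell
# check_fastq FILE: hypothetical sanity check; exits non-zero if the line
# count is not a multiple of 4 or a record header does not start with '@'.
check_fastq() {
  awk 'NR % 4 == 1 && $0 !~ /^@/ { bad = 1 }
       END { if (NR % 4 != 0) bad = 1; exit bad }' "$1"
}
# e.g. check_fastq SRR566890.fastq && echo "looks well-formed"
```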

3) Perform trimming:

singularity pull shub://descostesn/singularities:trimgalorev066
singularity exec --bind ./ singularities_trimgalorev066.sif trim_galore --quality 20 -o ./ --cores 4 SRR566890.2.fastq.gz
gunzip -c SRR566890.2_trimmed.fq.gz > SRR566890.2_trimmed.fq

4) Download mm10 fasta from ensembl:

wget ftp://ftp.ensembl.org/pub/release-102/fasta/mus_musculus/dna/Mus_musculus.GRCm38.dna.chromosome.{{1..19},X,Y}.fa.gz

zcat Mus_musculus.GRCm38.dna.chromosome.{{1..19},X,Y}.fa.gz > mm10_v38.fa
singularity pull shub://descostesn/singularities:gemv330

5) Here is the gem3 singularity recipe:

Bootstrap: debootstrap
OSVersion: bionic
MirrorURL: http://us.archive.ubuntu.com/ubuntu/

%help
    This recipe was copied from https://github.com/smarco/gem3-mapper/blob/master/docker/Dockerfile in Nov 2020.

%post

    DEBIAN_FRONTEND=noninteractive
    GEM_MAPPER_VERSION=master
    INSTALL_BASE=/software/opt/gem3-mapper
    SRC_BASE=/software/src/gem3-mapper

    apt-get -y update && apt-get -y install gcc git make && \
    apt-get -y clean && apt-get -y autoremove && rm -rf /var/lib/apt/lists/*

    #~~~~~~~~~ GEM 3.3.0 in Nov 2020 ~~~~~~~~~~#
    mkdir -p ${SRC_BASE} && mkdir -p ${INSTALL_BASE} && cd ${SRC_BASE} && \
    git clone --recursive https://github.com/smarco/gem3-mapper.git -b ${GEM_MAPPER_VERSION} ./ && \
    chmod +x configure && ./configure && make && \
    mv ${SRC_BASE}/bin ${INSTALL_BASE} && \
    ln -s ${INSTALL_BASE}/bin/* /usr/local/bin/

    which gem-mapper

%environment
    export PATH="${PATH}:/software/opt/gem3-mapper/bin" 

%labels
    Author Nicolas Descostes
    Version v0.0.1

6) Build the index:

singularity exec --bind ./ singularities_gemv330.sif gem-indexer --input mm10_v38.fa --output ./mm10_v38 --threads 10 --verbose

This should give the following error:

2021/1/22 16:18:39 --  100% ... done [0.457 s]
2021/1/22 16:18:41 -- [GEM Index './mm10_v38.gem' was successfully built in 10.743 min.] (see './mm10_v38.info' for further info)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> GEM.System.Error::Signal raised (no=11) [errno=2,No such file or directory]
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
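
Despite the non-zero exit code, the log claims the index was built, so before using it I check that the output files are actually present and non-empty (check_index is my own hypothetical helper, not a GEM tool):

```shell
# check_index DIR BASENAME: hypothetical helper that verifies the .gem and
# .info files written by gem-indexer exist and are non-empty.
check_index() {
  dir=$1; base=$2
  for ext in gem info; do
    [ -s "$dir/$base.$ext" ] || { echo "MISSING: $base.$ext"; return 1; }
  done
  echo "OK: $base index files present"
}
# e.g. check_index . mm10_v38
```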

7) Now trying on the first chromosome:

zcat Mus_musculus.GRCm38.dna.chromosome.1.fa.gz > Mus_musculus.GRCm38.dna.chromosome.1.fa
singularity exec --bind ./ singularities_gemv330.sif gem-indexer --input Mus_musculus.GRCm38.dna.chromosome.1.fa --output ./chr1 --threads 10 --verbose

Same error:

2021/1/22 16:34:58 --  100% ... done [1.897 s]
2021/1/22 16:34:58 -- [GEM Index './chr1.gem' was successfully built in 1.529 min.] (see './chr1.info' for further info)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> GEM.System.Error::Signal raised (no=11) [errno=2,No such file or directory]
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

8) Perform the alignment:

singularity exec --bind ./ singularities_gemv330.sif gem-mapper --index mm10_v38.gem --input SRR566890.fastq --output SRR566890.sam --report-file SRR566890.log --mapping-mode 'fast' --max-reported-matches 'all' --threads 10

It gives the error:

2021/1/22 16:26:21 -- [Opening input file 'SRR566890.fastq']
2021/1/22 16:26:21 -- [Outputting to 'SRR566890.sam']
2021/1/22 16:26:21 -- [Loading GEM index 'mm10_v38.gem']
2021/1/22 16:26:25 -- [SE::Mapping Sequences]
GEM::FatalError (mm_allocator.c:227,mm_allocator_allocate_reserve)
 MM-Allocator. Memory request over segment size
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GEM::Unexpected error occurred. Sorry for the inconvenience
     Feedback and bug reporting it's highly appreciated,
     => Please report or email (gem.mapper.dev@gmail.com)
GEM::Running-Thread (threadID = 4)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GEM::Version v3.6.1-25-g82cf-dirty-release
GEM::CMD gem-mapper --index mm10_v38.gem --input SRR566890.fastq --output SRR566890.sam --report-file SRR566890.log --mapping-mode fast --max-reported-matches all --threads 10
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GEM::Input.State
Sequence (File 'SRR566890.fastq' Line '75344')
@SRR566890.2.166411 
GAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGA
+
@@CFDDDFDDFHHIIIGHIEDGHIIIICDGGIIIIIIIIIGIIICGHIHI
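
To look at the offending read directly (it appears to be a low-complexity GAA repeat), one can print the lines around the reported position; show_context below is just a throwaway helper of mine, not part of GEM:

```shell
# show_context FILE LINE: hypothetical helper that prints the lines around
# the reported line number, enough to see the whole 4-line FASTQ record.
show_context() {
  awk -v n="$2" 'NR >= n - 3 && NR <= n + 3' "$1"
}
# e.g. show_context SRR566890.fastq 75344
```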

I hope this helps and thank you again!

smarco commented 3 years ago

Well, this certainly helps a lot. Thanks for such a detailed report; I really appreciate it.

Best,

descostesn commented 3 years ago

Hi,

Did you manage to reproduce the bug?

smarco commented 3 years ago

@descostesn not yet. I am quite overloaded at the moment. Accept my apologies. Nevertheless, I should be able to produce a patch soon.

Sorry for the inconvenience.

All the best,