NCBI-Hackathons / NovoGraph

NovoGraph: building whole genome graphs from long-read-based de novo assemblies
MIT License

running BAM2MAFFT. public.q issue #39

Closed SAMtoBAM closed 3 years ago

SAMtoBAM commented 4 years ago

Hello there,

I am running the CALLMAFFT.pl script.

I get this error message

Identified 16 directories.
 - chr11
 - chr5
 - chr6
 - chr15
 - chr1
 - chr10
 - chr14
 - chr4
 - chr12
 - chr16
 - chr13
 - chr3
 - chr9
 - chr7
 - chr2
 - chr8
Unable to run job: Job was rejected because job requests unknown queue "public.q".
Exiting.
qsub qsub temp_qsub failed at ../../../NovoGraph-master/scripts/CALLMAFFT.pl line 435.

The problem appears to come from the qsub command: it doesn't like the -q option it is given.

When I check the grid engine being used here with "qsub -help", it tells me I am running "GE 6.2u5", in case that is important.

Thank you

evanbiederstedt commented 4 years ago

This might require some debugging on your part.

https://github.com/NCBI-Hackathons/NovoGraph/blob/master/scripts/CALLMAFFT.pl#L388-L422

We've done our best to work with various job schedulers, but there could still be problems of course. It will be tricky to debug from here, so if you have solutions, please make a PR.

Based on some googling, this error has come up before: https://github.com/PacificBiosciences/FALCON-integrate/issues/128

Thanks, Evan

SAMtoBAM commented 4 years ago

Hi Evan,

Yes, it appears I am not alone in having this issue when running jobs on a grid engine. However, fixing it seems to require configuring the grid engine, which is beyond me...

Is there any way to run this without using the sun grid engine?

evanbiederstedt commented 4 years ago

You should be able to run the script without the --qsub flag:

https://github.com/NCBI-Hackathons/NovoGraph/blob/master/scripts/CALLMAFFT.pl#L381-L438

SAMtoBAM commented 4 years ago

I have run it without the qsub flag and unfortunately get the same error.

evanbiederstedt commented 4 years ago

Yes, that's because of this line:

https://github.com/NCBI-Hackathons/NovoGraph/blob/master/scripts/CALLMAFFT.pl#L54

my $qsub = 1;

Try changing it to my $qsub = 0; I'll revise this in the scripts.
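For illustration, here is a minimal sketch of how a hard-coded default like the one quoted above could be turned into a default that a command-line flag overrides. It is not the code in CALLMAFFT.pl or in the pull request: the helper name parse_qsub and the option spec 'qsub=i' are assumptions made for this example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not part of NovoGraph): return the qsub setting,
# defaulting to 0 so that no job scheduler is needed unless requested.
sub parse_qsub {
    my @args = @_;
    my $qsub = 0;    # default: run locally, without qsub
    # 'qsub=i' is an assumed option spec: the flag takes an integer value.
    GetOptionsFromArray(\@args, 'qsub=i' => \$qsub)
        or die "Could not parse command-line options\n";
    return $qsub;
}

print "no flag:  ", parse_qsub(), "\n";
print "--qsub 1: ", parse_qsub('--qsub', '1'), "\n";
```

With a default of 0, leaving the flag off runs everything locally, and the grid engine is only involved when --qsub 1 is passed explicitly.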

evanbiederstedt commented 4 years ago

I haven't tried it, so please test it out: https://github.com/NCBI-Hackathons/NovoGraph/pull/40

CC @SAMtoBAM

SAMtoBAM commented 4 years ago

So I changed qsub to 0 and got this error instead now:

`Identified 16 directories.

evanbiederstedt commented 4 years ago

https://github.com/NCBI-Hackathons/NovoGraph/blob/master/scripts/MSAandBAM.pm

This may need some commentary from @TorHou

@SAMtoBAM

I believe that we need to add use MSAandBAM; at the top of the script CALLMAFFT.pl. I updated the pull request.

Please try it out, and let me know how it works.

evanbiederstedt commented 4 years ago

@SAMtoBAM Did that work for you? Please let me know.

Thanks, Evan

SAMtoBAM commented 4 years ago

@evanbiederstedt Unfortunately it did not. It fails to compile because it cannot locate MSAandBAM:

Can't locate MSAandBAM.pm in @INC (you may need to install the MSAandBAM module) (@INC contains: /home/samuel/perl5/lib/perl5 /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .) at ../../../NovoGraph-master/scripts/CALLMAFFT.pl line 16.
BEGIN failed--compilation aborted at ../../../NovoGraph-master/scripts/CALLMAFFT.pl line 16.

evanbiederstedt commented 4 years ago

You'll have to locate the perl module: https://stackoverflow.com/questions/841785/how-do-i-include-a-perl-module-thats-in-a-different-directory

TorHou commented 4 years ago

Yes, that was my mistake. @evanbiederstedt, could you add the lines

use FindBin;                 # locate this script
use lib "$FindBin::Bin/..";  # use the parent directory

before use MSAandBAM to your PR and I will merge it.

SAMtoBAM commented 4 years ago

I tried adding that and got the same error message.

Instead, I just moved 'use MSAandBAM' to below the line that already uses FindBin to pick up all the modules in the script directory, so that part now looks like this:

use FindBin;
use File::Spec;
use List::Util qw/shuffle/;
use Cwd;
use lib $FindBin::Bin; # finds all modules in script directory
use MSAandBAM;
my $current_dir = getcwd;

This seemed to work for a while, then this happened:

`Identified 16 directories.

chr11
chr5
chr6
chr15
chr1
chr10
chr14
chr4
chr12
chr16
chr13
chr3
chr9
chr7
chr2
chr8
Call myself for chunk 0
Go from line 0 to 4
Processing /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT/chr13/chr13_87.fa
Executing /usr/bin/mafft --retree 1 --maxiterate 0 --quiet /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT/chr13/chr13_87.fa.tmp_in > /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT/chr13/chr13_87.fa.tmp_out
[main_samview] fail to read the header from "-".
[main_samview] fail to read the header from "-".
[main_samview] fail to read the header from "-".
[main_samview] fail to read the header from "-".
[main_samview] fail to read the header from "-".
File /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT/chr13/chr13_87.bam not there, but after perl ../../../NovoGraph-master/scripts/fas2bam.pl --input /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT/chr13/chr13_87.mfa --output /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT/chr13/chr13_87.bam --ref "ref" --bamheader windowbam.header.txt --samtools_path /usr/local/bin/samtools it should be - attempts 5 - last exit status 36096! at /home/samuel/NovoGraph-master/scripts/MSAandBAM.pm line 44.
Command perl /home/samuel/NovoGraph-master/scripts/CALLMAFFT.pl --mafftDirectory /home/samuel/1.phenovar_fq/genome_graph/novograph/intermediate_files/forMAFFT --action processChunk --chunkI 0 --chunkSize 5 --mafft_executable /usr/bin/mafft --fas2bam_path ../../../NovoGraph-master/scripts/fas2bam.pl --samtools_path /usr/local/bin/samtools --bamheader windowbam.header.txt failed at ../../../NovoGraph-master/scripts/CALLMAFFT.pl line 446.
`

It generates an mfa file for each section it begins to analyse. In the run above that is chr13_87, whose mfa file appears fine, but the corresponding bam file is empty.

TorHou commented 4 years ago

If I read this correctly, the exit code for the last attempt is 36096, which, following this discussion, seems to indicate that the process received a SIGPIPE signal.
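The arithmetic behind that reading can be sketched as follows, assuming the reported number is Perl's raw wait status ($?) from running the command through a shell. The helper name decode_status is made up for this example. The high byte of the status holds the exit code, and shells conventionally report "killed by signal N" as exit code 128 + N.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status, as Perl stores it in $?,
# into its parts.
sub decode_status {
    my ($raw) = @_;
    my $exit_code = $raw >> 8;     # high byte: exit code of the child
    my $signal    = $raw & 127;    # low 7 bits: signal that killed the child directly
    # Shell convention: exit code 128 + N means a command died from signal N.
    my $shell_signal = $exit_code > 128 ? $exit_code - 128 : 0;
    return ($exit_code, $signal, $shell_signal);
}

my ($exit_code, $signal, $shell_signal) = decode_status(36096);
print "exit code:             $exit_code\n";       # 141
print "direct signal:         $signal\n";          # 0
print "signal via shell code: $shell_signal\n";    # 13, which is SIGPIPE on Linux
```

So 36096 is 141 * 256, and 141 = 128 + 13. Signal 13 is SIGPIPE on Linux, which fits a pipeline in which one command exited while another was still writing to it.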

evanbiederstedt commented 3 years ago

Did that resolve your issue @SAMtoBAM ?