RitchieLabIGH / IRFinder

MIT License

Conda Package? #6

Closed jscaber closed 2 years ago

jscaber commented 2 years ago

Dear Claudio,

do you have plans to update the current conda version of irfinder (1.3.1 on bioconda) to the current package?

Best wishes, Jakub

CloXD commented 2 years ago

Dear Jakub, good question, we haven't thought about it. We currently provide Docker and Singularity images, which ensure a higher level of reproducibility. I have no experience creating conda packages and not much time to spend on it at the moment, but if you or anyone in the community is motivated to create and test one, we'll be very grateful. That's also one of the greatest strengths of open source projects :) Cheers, Claudio

jscaber commented 2 years ago

Dear Claudio,

many thanks. I've gone with Singularity for now, with some success, but I have hit an error.

command submitted to cluster:

singularity run -H $PWD:/home \
    -B /PARENTFOLDER \
    /PATHTOSINGULARITY/IRFinder BAM \
    -r IRFinder.dir/REF \
    -d IRFinder.dir/SAMPLE \
    -t 1 \
    SAMPLE.bam

This is the error message I get:

/usr/local/IRFinder/bin/IRFinderBAM: line 163: 48333 Aborted                 ${LIBEXEC}/irfinder ${OUTPUTDIR} ${REF}/IRFinder/ref-cover.bed ${REF}/IRFinder/ref-sj.ref ${REF}/IRFinder/ref-read-continues.ref ${REF}/IRFinder/ref-ROI.bed ${READ_TYPE} "${AI_WARN}:${AI_INTRON}:${AI_RATIO}" "${JITTER}" $1 >> $OUTPUTDIR/logs/irfinder.stdout 2>> $OUTPUTDIR/logs/irfinder.stderr

The two files are: stderr:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoul

stdout:

 --------------------
|  IRFinder v. 2.0.0 | 
 --------------------

---
IRFinder version: 2.0.0 
IRFinder start:  Fri Mar 25 16:20:49 GMT 2022
IRFinder runmode: BAM
IRFinder user@host: myself @ myserver
IRFinder working dir:  /WORKDIRPATH/splicing
IRFinder reference: IRFinder.dir/REF
IRFinder file 1: CTR.bam
---
[  Fri Mar 25 16:20:49 GMT 2022  ] Processing the BAM file with IRFinder
---
The given bam file is sorted by coordinate and is paired.
IRFinder run with options:
 - Output Dir:                  IRFinder.dir/CTR
 - Main intron ref.:            IRFinder.dir/REF/IRFinder/ref-cover.bed
 - Splice junction ref.:        IRFinder.dir/REF/IRFinder/ref-sj.ref
 - Read spans ref.:             IRFinder.dir/REF/IRFinder/ref-read-continues.ref
 - Optional ROI ref.:           IRFinder.dir/REF/IRFinder/ref-ROI.bed
 - Read type:                   SR
 - AI levels:                   1:1:0.05
 - Input BAM:                   CTR-841-1.bam

Preparing the reference:
 - Junction count...done.
 - Span points...done.
 - Coverage blocks...

If you have any suggestions, please let me know.

Best wishes, Jakub

CloXD commented 2 years ago

Hello Jakub, there must be a problem with the ref-cover.bed file. It should look something like this:

1   924953  925003  skip    0   +   924953  925003  255,0,0 1   50  0
1   924953  925916  dir/SAMD11/ENSG00000187634/+/1/924948/925921/973/140/clean  0   +   924953  925916  255,0,0 3   191,531,111 0,241,852
1   924953  925916  nd/SAMD11/ENSG00000187634/+/1/924948/925921/973/140/clean   0   +   924953  925916  255,0,0 3   191,531,111 0,241,852
1   925194  925244  skip    0   +   925194  925244  255,0,0 1   50  0
1   925194  925916  dir/SAMD11/ENSG00000187634/+/2/925189/925921/732/90/clean   0   +   925194  925916  255,0,0 2   531,111 0,611
1   925194  925916  nd/SAMD11/ENSG00000187634/+/2/925189/925921/732/90/clean    0   +   925194  925916  255,0,0 2   531,111 0,611
1   925805  925855  skip    0   +   925805  925855  255,0,0 1   50  0

Could you show the first lines of yours? It would also be useful to know which files you used to generate the reference.

Cheers, Claudio
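
For anyone who wants to run the check requested above, here is a minimal sketch. The function name `inspect_bed` is hypothetical and not part of IRFinder. It reports an empty or missing file, and otherwise prints the number of tab-separated columns in the first line (the example above has 12) followed by the first five lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of IRFinder): inspect a reference BED file.
inspect_bed() {
    local bed="$1"
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$bed" ]; then
        echo "empty or missing: $bed" >&2
        return 1
    fi
    # Count tab-separated fields in the first line.
    echo "columns in first line: $(head -n 1 "$bed" | awk -F '\t' '{ print NF }')"
    # Show the first lines for a visual comparison.
    head -n 5 "$bed"
}

# Example call, using the path reported in the stdout log above:
# inspect_bed IRFinder.dir/REF/IRFinder/ref-cover.bed
```

Because the function returns a non-zero status for an empty file, it can also serve as a guard in a pipeline step.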

jscaber commented 2 years ago

Bang on, it's empty. Instead, exclude.omnidirectional.bed is the only file with content, and it looks like this:

chr1    0       10050   M
chr1    10450   10530   M
chr1    10650   10750   M

Thanks, I will investigate this myself, but my first thought is that we have a pipeline that changes the Ensembl naming convention from 1 -> chr1 for Ensembl/UCSC compatibility. For completeness, here is the command I ran:

singularity run -H $PWD:/home                    -B /STAR/hg38_junc101_149.dir, \
/ENSEMBL/geneset_all.gtf.gz,/INDEX/hg38.fa,/RFinder/Human_hg38_nonPolyA_ROI.bed \
IRFinder BuildRefFromSTARRef       -r IRFinder.dir/REF    \
-x /STAR/hg38_junc101_149.dir                     -g /ENSEMBL/geneset_all.gtf.gz    \
-f /INDEX/hg38.fa                     -t 1 -b /IRFinder/Human_hg38_nonPolyA_ROI.bed  > IRFinder.dir/REF.log

I will try renaming the contigs in ROI.bed first, but if I have no luck I may just revert to FASTQ input, with or without rebuilding the index.

Best wishes, Jakub
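
The contig renaming mentioned above could be sketched as follows. This is an illustration, not the command used in the thread. It assumes the BED file uses Ensembl-style names (1, 2, MT) and must be converted to UCSC-style names (chr1, chr2, chrM). The direction of the conversion, the MT -> chrM mapping, and the function name `add_chr_prefix` are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a "chr" prefix to the first column of a
# tab-separated BED stream (stdin -> stdout).
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         # Pass comment, track, and browser header lines through unchanged.
         /^(#|track|browser)/ { print; next }
         # Rename only contigs that do not already start with "chr";
         # the mitochondrial mapping MT -> chrM is an assumption.
         $1 !~ /^chr/ { $1 = ($1 == "MT") ? "chrM" : ("chr" $1) }
         { print }'
}

# Example call (file names are placeholders):
# add_chr_prefix < ROI.ensembl.bed > ROI.ucsc.bed
```

Only the first column is rewritten, so a value such as M in the fourth column is left alone.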

CloXD commented 2 years ago

I'm glad we spotted the problem. If you need any help, don't hesitate to ask. Cheers, Claudio

jscaber commented 2 years ago

Thanks, got it to work. As always, the error was prosaic, and it's visible in my post above: I passed a compressed GTF to -g. Your script is nicely readable, so I could find and repair the mistake by reading through the code and rerunning Build-BED-refs.sh ;)!

Just FYI: no exception was raised for this particular error, which is why I missed it in a pipeline. I'm not sure whether you want to add one. There was, however, an error message in REF.log.
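
Until such a check exists upstream, a pipeline could guard against this case itself. The sketch below is not IRFinder code, and the function name `require_plain_gtf` is hypothetical. It looks at the first two bytes of the annotation file and fails if they match the gzip magic number (1f 8b), so that a compressed GTF is rejected before the reference build starts.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of IRFinder): refuse a
# gzip-compressed annotation file before building the reference.
require_plain_gtf() {
    local gtf="$1"
    if [ ! -r "$gtf" ]; then
        echo "ERROR: cannot read $gtf" >&2
        return 2
    fi
    # Read the first two bytes as hex; gzip streams start with 1f 8b.
    local magic
    magic=$(head -c 2 "$gtf" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo "ERROR: $gtf is gzip-compressed; decompress it before passing it to -g" >&2
        return 1
    fi
    return 0
}

# Example call, using the path from the build command above:
# require_plain_gtf /ENSEMBL/geneset_all.gtf.gz || exit 1
```

Checking the file content rather than the .gz extension also catches compressed files that have been renamed.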