Closed dfermin closed 5 years ago
Hi @dfermin, to be honest I haven't tried running portcullis with that many samples. I suspect you are hitting the command line length limit on your system. There are ways of working around this (the relevant limit is ARG_MAX, a system limit rather than an environment variable, which you can inspect with `getconf ARG_MAX`). However, I think it makes more sense to sort each of those BAMs individually and merge them up front, then feed the merged BAM into portcullis. Portcullis would do this internally anyway. You can then optionally delete the merged BAM after portcullis has run if you are tight on storage space.
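The sort-then-merge approach could look roughly like the sketch below. It assumes `samtools` is installed and on your PATH; the function name `merge_and_run`, the working directory layout and the `DRY_RUN` switch are made up for illustration, so adapt them to your setup.

```shell
#!/usr/bin/env bash
# Sketch only: sort every BAM, merge them into one file, run portcullis once.
# Assumes samtools is available; merge_and_run, the work directory layout and
# DRY_RUN are hypothetical names chosen for this example.
set -euo pipefail

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

merge_and_run() {
    local bam_dir="$1" genome="$2" work_dir="$3" threads="${4:-8}"
    local list="$work_dir/sorted_bams.txt"
    local bam sorted

    mkdir -p "$work_dir/sorted"
    : > "$list"    # start with an empty list of sorted BAMs

    # Coordinate-sort each input BAM and record the path of the sorted copy.
    for bam in "$bam_dir"/*.bam; do
        sorted="$work_dir/sorted/$(basename "$bam")"
        run samtools sort -@ "$threads" -o "$sorted" "$bam"
        echo "$sorted" >> "$list"
    done

    # Merge from a file of file names, so the number of BAMs never
    # appears on the command line.
    run samtools merge -@ "$threads" -b "$list" "$work_dir/merged.bam"
    run samtools index "$work_dir/merged.bam"

    # A single portcullis run on the merged BAM, with a modest thread count.
    run portcullis full -t "$threads" "$genome" "$work_dir/merged.bam"
}

# Example call, using the paths from the question:
# merge_and_run ../star.bam.files /data/hs37d5.fa ./portcullis_work 8
```

Setting `DRY_RUN=1` before the call prints the commands without running them, which is a cheap way to check the paths first. Once portcullis has finished, the `sorted` directory and `merged.bam` can be deleted to reclaim space.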
Another option would be to run portcullis independently on each BAM. There are pros and cons of doing this depending on the levels of coverage in each file. You can then use junctools to combine the results, either taking the intersection or union of each junction.
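For the per-BAM route, a loop along these lines might do. It is a sketch that assumes `portcullis full` accepts `-o` to set the output directory (check `portcullis full --help` for your version); `run_each`, the output layout and `DRY_RUN` are hypothetical names for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: one portcullis run per BAM, each in its own output directory.
# Assumption: `portcullis full` takes -o for the output directory; confirm
# with `portcullis full --help`. run_each and DRY_RUN are hypothetical names.
set -euo pipefail

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_each() {
    local bam_dir="$1" genome="$2" out_root="$3" threads="${4:-8}"
    local bam sample

    for bam in "$bam_dir"/*.bam; do
        # Name each output directory after the BAM, e.g. s12.bam -> s12
        sample="$(basename "$bam" .bam)"
        run portcullis full -t "$threads" -o "$out_root/$sample" "$genome" "$bam"
    done
}

# Example call, using the paths from the question:
# run_each ../star.bam.files /data/hs37d5.fa ./portcullis_runs 8
```

The per-sample junction files can then be combined with junctools; its exact arguments are deliberately not shown here, so consult the junctools help for the intersection and union options.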
One other point is that I'd recommend you reduce the number of threads. Running with 48 threads will significantly increase the amount of memory required without speeding portcullis up much. I start seeing diminishing returns after 8 threads.
Hi, I'm trying to run portcullis on my entire RNA-seq data set. I ran it successfully on a few files and the results look good. Here is the command I used:
portcullis full -t 48 /data/hs37d5.fa ../star.bam.files/s12.bam ../star.bam.files/s13.bam ../star.bam.files/s23.bam
I have a total of 275 BAM files I'd like to process with portcullis, but when I try this command:
portcullis full -t 48 /data/hs37d5.fa ../star.bam.files/*.bam
I get the error message:
Error: Parsing Command Line: too many positional options have been specified on the command line
I'm guessing that the program can't expand the command line argument of that size. What is the proper way to run portcullis in this instance?
Thanks