cougarlj / COMPSRA

COMPSRA: a COMprehensive Platform for Small RNA-Seq data Analysis
https://regepi.bwh.harvard.edu/circurna/
GNU General Public License v3.0

zcat does not work in MacOS as intended #40

Open edfajardo opened 2 years ago

edfajardo commented 2 years ago

The default parameter for --readFilesCommand in STAR is zcat. However, when you zcat any file in macOS it will look for a .Z extension and will throw an error:

zcat: can't stat: .... filename.gz.Z: no such file or directory

I can get around this problem when issuing the alignment command directly to STAR by setting the --readFilesCommand switch to an appropriate alternative. However, it is not clear how to set this parameter in COMPSRA.jar. The README file says that the -mp switch is the way to do it, but no further details are provided. Could you provide an example of how to set parameters to the STAR aligner?
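For reference, a minimal sketch of what such a direct STAR call might look like. The genome index path and read file name are hypothetical placeholders, and `gunzip -c` is only one possible replacement for `zcat`. The command is printed rather than executed so it can be reviewed first.

```shell
# A decompression command that takes a filename and writes to stdout.
# gunzip -c (same as gzip -dc) does not append .Z to the filename on macOS.
READ_CMD="gunzip -c"

# Hypothetical paths: replace with your own STAR index and reads.
GENOME_DIR="/path/to/star_index"
READS="sample.fastq.gz"

# Build the command line as a string and print it for inspection.
STAR_CMD="STAR --genomeDir $GENOME_DIR --readFilesIn $READS --readFilesCommand $READ_CMD --outSAMtype BAM Unsorted"
echo "$STAR_CMD"
```

Once the printed line looks right, it can be pasted into the terminal as is.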

Thank you.

E. Fajardo

cougarlj commented 2 years ago

Dear edfajardo,

I'm sorry for the inconvenience. We will do more tests in MacOS in the next release. You should find a file with the name star.para in the ./bundle_v1/configuration/ directory and you can change the star parameters in this file.

If you have any problems, please let me know.

Best Wishes, Jiang Li

edfajardo commented 2 years ago

Thank you for your reply, Jiang Li. That behavior of zcat in MacOS is very annoying; here is some extra information that might help you in the next release:

Let's say I have a gzipped file called testzip.gz. Then,

the system zcat command, which resides in /usr/bin/, throws an error:

zcat testzip.gz

zcat: can't stat: testzip.gz (testzip.gz.Z): No such file or directory

you can still use the system command but modify the instruction:

zcat < testzip.gz # this works as expected

What I did to work around the problem is the following:

1- I have a bin directory in my home folder where I have my scripts and programs (say /Users/myhome/bin)

2- Create an executable file in /Users/myhome/bin named zcat, with the following lines:

#!/bin/sh

gzip -dc "$@"

3- I am using bash as my shell so I add /Users/myhome/bin to the beginning of my $PATH in .bash_profile:

# add this line to .bash_profile in your home directory

export PATH=/Users/myhome/bin:${PATH}

Now, when zcat is called, the system will find my version of it first and will use it instead of the system's version.
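The steps above can be sketched as a single script. This is only an illustration: `BIN_DIR` is a hypothetical variable that defaults to `$HOME/bin`, and the `PATH` change applies to the current shell session only (the `.bash_profile` line is still needed to make it permanent).

```shell
#!/bin/sh
# BIN_DIR is an assumption; any directory you own will do.
BIN_DIR="${BIN_DIR:-$HOME/bin}"
mkdir -p "$BIN_DIR"

# Write the wrapper. "$@" forwards every argument, so filenames with
# spaces work, and "zcat < file" (no arguments, data on stdin) also works.
cat > "$BIN_DIR/zcat" <<'EOF'
#!/bin/sh
exec gzip -dc "$@"
EOF
chmod +x "$BIN_DIR/zcat"

# Put BIN_DIR first on PATH for the current session.
PATH="$BIN_DIR:$PATH"
export PATH

# Show which zcat the shell now resolves; it should be the wrapper.
command -v zcat
```

Because the wrapper calls `gzip -dc` directly, it never triggers the `.Z` filename lookup done by the system `zcat`.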

I was able to run COMPSRA with this workaround. One last point/question:

The STAR.para file that you describe has 3 lines with parameters, almost identical (I see, for example, that the first two lines have --outSAMtype BAM Unsorted while the third line has --outSAMtype BAM SortedByCoordinate). There might be other differences but I have not checked carefully. I assume that the program is using the first line (because the messages sent to STDOUT say --outSAMtype BAM Unsorted). Am I correct in saying that the first line is used? What is the purpose of the other lines in STAR.para?

Thank you.

Eduardo

cougarlj commented 2 years ago

Dear Eduardo,

Thank you for your feedback. We will fix this in the next release. As for the STAR.para file, by default, the first line of parameters is used. We have used COMPSRA for other projects with different parameters, so there may be command lines left over there. You can use "-mp" to choose which line in STAR.para you want to use, which is convenient for aligning reads with different parameters (no need to change the parameters every time).
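To see which parameter line a given index would refer to, a small helper like the following may be useful. It is only a sketch: `show_para_line` and `list_para_lines` are hypothetical names, and the helper counts lines from 1 as `sed` does. The thread does not state whether "-mp" counts from 0 or 1, so check that before relying on it.

```shell
# Print every line of a parameter file prefixed with its line number.
list_para_lines() {
    awk '{ print NR ": " $0 }' "$1"
}

# Print only line N of a parameter file (1-based, an assumption).
show_para_line() {
    para_file=$1
    index=$2
    sed -n "${index}p" "$para_file"
}
```

For example, `list_para_lines ./bundle_v1/configuration/star.para` lists all the candidate lines, and `show_para_line ./bundle_v1/configuration/star.para 1` prints the first one.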

Best Wishes, Jiang Li

edfajardo commented 2 years ago

Thanks, Jiang. I also want to say that the package is very nice. The annotation results look very good (I have not analyzed the data yet, but that's on me). You did a great job.

Eduardo

cougarlj commented 2 years ago

Dear Eduardo,

Thank you for the encouragement. We are working on COMPSRA2 now and hope the new version will be more useful and could help more people in their studies.

Best wishes, Jiang Li

kenminsoo commented 2 years ago

Dear Jiang,

Your package is comprehensive and amazing so far. But I just wanted to give you a suggestion regarding the documentation for the -mp function. It would be nice to just have a little sentence mentioning that -mp expects an integer index as an input that connects to the star.para file. It is actually a really nice function when I think about it, and perhaps I'm the only one who felt that it wasn't intuitive. Best of luck with your development of the second version!

Best, Ken

cougarlj commented 2 years ago

Dear Ken,

Thank you for your suggestion. We will emphasize this function in COMPSRA2 and in its tutorial.

Best regards, Jiang Li
