qqwang-berkeley / JUM

A tool for annotation-free differential analysis of tissue-specific pre-mRNA alternative splicing patterns
MIT License

regarding input files #20

Closed MurliNair closed 5 years ago

MurliNair commented 5 years ago

Hi Qingqing, When I run STAR with the options you have specified here "https://github.com/qqwang-berkeley/JUM/wiki/0.b.-Input-files", it does not create the Log.final.out. Can I ignore that, since it is just a log file? Also, regarding 2nd pass mapping, the options you have specified are as follows:

STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 \
--outFileNamePrefix ctrl_1 \
--readFilesIn ctrl_1_R01.fastq ctrl_1_R02.fastq \
--outSJfilterReads Unique \
--outSAMstrandField intronMotif \
--outFilterMultimapNmax 1 \
--sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab \
1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab

My question is regarding the order of input files to be specified with the --readFilesIn switch. I presume I specify them in the order the *SJ.out.tab files are given? Or do I specify the fastq files for ctrl_1 only, since the outFileNamePrefix is ctrl_1? I would appreciate it if you could clarify this for me. Thanks ../Murli

qqwang-berkeley commented 5 years ago

Hi Murli,

It is likely that STAR did not finish running or was terminated, because otherwise you would see the Log.final.out file. This file records important information such as the total number of reads, how many were uniquely mapped, etc. I suggest you make sure your STAR run is complete before moving on to downstream analysis with JUM. One way to check is to see whether there is a STAR temporary folder (by default named _STARtmp, prefixed with your --outFileNamePrefix) in the working directory where you ran STAR. This folder holds all the temporary files written while STAR runs, and it should be removed once STAR finishes successfully. If the folder is still there, you need to run STAR again and make sure it completes.
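The check described above can be scripted. The following is only a sketch: it assumes STAR's default temporary folder name (the output prefix followed by _STARtmp), and `star_finished` is a hypothetical helper name, not part of STAR or JUM.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a STAR run looks complete.
# Assumption: the temp folder uses STAR's default name, <prefix>_STARtmp.
# star_finished is a hypothetical helper, not a STAR or JUM command.

star_finished() {
  local prefix=$1
  # A finished run has a non-empty Log.final.out and no leftover temp folder.
  [ -s "${prefix}Log.final.out" ] && [ ! -d "${prefix}_STARtmp" ]
}

# Report each output prefix given on the command line, e.g. ctrl_1 ctrl_2 treat_1
for prefix in "$@"; do
  if star_finished "$prefix"; then
    echo "$prefix: complete"
  else
    echo "$prefix: incomplete, rerun STAR"
  fi
done
```

Saved under a name of your choice and run with the output prefixes as arguments, it prints one line per sample, so an unfinished run is easy to spot before starting JUM.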

For the 2nd pass, I recommend you submit one STAR mapping command for each sample you have. For ctrl_1, for example, you should specify only the fastq files for ctrl_1 (ctrl_1_R1.fastq and ctrl_1_R2.fastq, if the data are paired-end RNA-seq). If you have three controls and three knockdowns, then you run six STAR mapping commands in total.

For the *SJ.out.tab files, you should always provide the SJ.out.tab files from all samples to the mapping job for each sample. The reason is that you want the mapping software to be aware of potential splice junctions that have been detected in any of the samples.
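To make the two points above concrete, here is a sketch that builds one 2nd-pass command per sample and passes the same full list of first-pass junction files to each. The sample names, the _R1/_R2 fastq naming, the 1st_SJ/ layout, and the DRY_RUN switch are assumptions for illustration. The STAR options are the ones quoted in the question, with --sjdbFileChrStartEnd written with two dashes.

```shell
#!/usr/bin/env bash
# Sketch: one 2nd-pass STAR job per sample, each given ALL first-pass SJ.out.tab files.
# Assumptions: sample names, paired-end <sample>_R1/_R2.fastq naming, 1st_SJ/ layout.
set -euo pipefail

samples=(ctrl_1 ctrl_2 ctrl_3 treat_1 treat_2 treat_3)

# Collect every first-pass junction file; the same list goes to every job.
sj_files=()
for s in "${samples[@]}"; do
  sj_files+=("1st_SJ/${s}SJ.out.tab")
done

# Print the 2nd-pass command for one sample: only that sample's fastq files,
# but the junction files from all samples.
second_pass_cmd() {
  local s=$1
  echo STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 \
    --outFileNamePrefix "$s" \
    --readFilesIn "${s}_R1.fastq" "${s}_R2.fastq" \
    --outSJfilterReads Unique \
    --outSAMstrandField intronMotif \
    --outFilterMultimapNmax 1 \
    --sjdbFileChrStartEnd "${sj_files[@]}"
}

for s in "${samples[@]}"; do
  cmd=$(second_pass_cmd "$s")
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$cmd"   # default: only show what would be run
  else
    $cmd          # word splitting is intended; assumes no spaces in paths
  fi
done
```

By default the script only prints the six commands so they can be inspected or submitted to a scheduler. Setting DRY_RUN=0 runs them one after another.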

Let me know if this is clear or if you have any other questions.

Qingqing


MurliNair commented 5 years ago

Thanks Qingqing, only my first set was not completed for some reason. I reran it and it created the appropriate log files. I really like your PNAS paper. I am sure I will have some questions as I process my data. Cheers../Murli

MurliNair commented 5 years ago

Hi Qingqing, Everything is running fine so far. Just wanted to check with you about the files in the directory named temp_JUM_A_run. Are those temporary files? I presume they persist even after the run is complete? Thanks ../Murli

qqwang-berkeley commented 5 years ago

Hi Murli,

Yes, these are temporary files. They are mostly for debugging purposes, and will still be there after the run is complete. Once you are sure everything went well and you need more space, you can delete the folder yourself. For now, you can just leave it there.

Qingqing


MurliNair commented 5 years ago

Thanks, shall proceed accordingly. Cheers../Murli