twlab / TEProf2Paper

TEProf2 Pipeline used to find promoters and predict protein sequences from RNA-sequencing data

Step 11: how to design treatment group versus control #14

Closed songlyzz closed 4 months ago

songlyzz commented 6 months ago

Hi Nakul: I am using TEProf2 to find TE promoters across different groups, but when I run step 11: Quantification processing, sample identification, and final table creation (finalStatisticsOutput.R), I am confused about how to set -e (default: ''): The label in the treatment file names that will identify them. Do I only need to supply the treatment name, and which file name from the earlier steps should it match? Also, if I have four groups of samples in total, do I need to rerun steps 1 to 11 for each designated set of samples? Best wishes, Song.

nakul2234 commented 6 months ago

Hello,

Essentially, the pipeline assumes that the "treatment" samples have a label in the file name that identifies them. For example, say you have the following 4 files: Treatment1_rep1.bam, Treatment1_rep2.bam, Treatment2_rep1.bam, and Treatment2_rep2.bam. If you want to compare Treatment1 to Treatment2, then you could do -e "Treatment1". Then the files with "Treatment1" in their name will be the first group, and the samples without it will be in a different group.
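
To make the grouping rule concrete, here is a minimal Python sketch of the idea. It is not the code in finalStatisticsOutput.R: the function name `split_by_label` is hypothetical, and it assumes the label is matched as a plain, case-sensitive substring of the file name.

```python
# Illustrative sketch only, not the TEProf2 implementation.
# Assumption: the -e label is matched as a case-sensitive substring of the file name.

def split_by_label(file_names, label):
    """Split file names into (treatment, other) by whether they contain the label."""
    treatment = [name for name in file_names if label in name]
    other = [name for name in file_names if label not in name]
    return treatment, other


files = [
    "Treatment1_rep1.bam",
    "Treatment1_rep2.bam",
    "Treatment2_rep1.bam",
    "Treatment2_rep2.bam",
]

# Equivalent in spirit to passing -e "Treatment1"
treatment, other = split_by_label(files, "Treatment1")
print("treatment group:", treatment)
print("other group:", other)
```

Under this rule, every file whose name lacks the label ends up in the second group. With four groups in the same run, a label such as "Treatment1" would therefore be compared against the other three groups pooled together, which is one reason to pick file names and labels carefully.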

If you want to do more complex intergroup analysis than just comparing 1 group versus another, I would suggest using the ballgown package from stringtie. The output files of the pipeline are compatible with it, so you can do transcript-level analysis to find upregulated transcripts in different groups.

-Nakul