Open Pentayouth opened 2 years ago
We have provided the program "addXS" in the package. It adds the "XS" field by checking the donor/acceptor motifs. The command is "samtools view -h in.bam | ./addXS reference_genome.fa | samtools view -bS - > out.bam"
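To see whether a BAM already carries XS strand tags, before or after running addXS, something like the following could be used. It is only a minimal sketch: the function name `count_xs` is hypothetical and not part of PsiCLASS. It reads SAM text on stdin, so it would be fed by `samtools view`.

```shell
# Hypothetical helper (not part of PsiCLASS): count SAM records carrying an XS:A strand tag.
# Reads SAM text on stdin; header lines (starting with @) are skipped.
# Prints "<records with XS:A> <total records>".
count_xs() {
  awk -F'\t' '
    /^@/ { next }
    {
      total++
      # Optional SAM fields start at column 12.
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^XS:A:[+-]$/) { tagged++; break }
      }
    }
    END { printf "%d %d\n", tagged, total }
  '
}
```

For example, `samtools view in.bam | head -n 100000 | count_xs` would sample the first records. Unspliced reads may legitimately lack the tag, so a count below the total is not by itself a problem.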
But this still takes a while to generate, because it needs to decompress and compress the BAM file. I can add the strand-specific feature, and it should not take long.
Just want to confirm, are all the samples from the same strand library? Or is it a mixture? Thank you.
All the samples are the same strand library.
Thanks for the information. I've added the option --stranded to psiclass in the git branch "stranded". Could you please check out the branch and test whether PsiCLASS generates reasonable results? If so, I will merge these updates into the master branch. You can specify the strand library through the option like "--stranded rf" or "--stranded fr". Thank you!
The branch didn't work properly. I ran:
/public/home/lijing/wangzw/resource/bins/psiclass/psiclass --lb bam.list -p 1 --stranded rf
which threw an error
$ /public/home/wang/resource/bins/psiclass/psiclass --lb bam.list -p 1 --stranded rf
sh: /public/home/wang/resource/bins/psiclass/samtools-0.1.19/samtools: No such file or directory
Found mate read id index suffix(.1 or /1). Calling "--mateIdx 1" option. If this is a false calling, please use "--mateIdx 0".
/public/home/wang/resource/bins/psiclass/junc /public/home/wang/subject/star_new/N1/N1.2pass.Aligned.sortedByCoord.out.bam -a --stranded rf --hasMateIdSuffix > ./splice/psiclass_bam_0.raw_splice
sh: /public/home/wang/resource/bins/psiclass/junc: No such file or directory
Terminated
It seems the program junc and samtools are not compiled. Could you please run "make" to generate those executables? Thank you.
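A quick pre-flight check along these lines could catch the missing build step before a long run. This is a sketch under assumptions: the function name `check_built` is hypothetical, and the two relative paths are simply the ones that appear in the error log above.

```shell
# Hypothetical pre-flight check (not part of PsiCLASS): verify that the helper
# binaries named in the error log exist and are executable.
# $1 is the PsiCLASS directory. Returns 0 if both are present, 1 otherwise.
check_built() {
  dir=$1
  missing=0
  for exe in junc samtools-0.1.19/samtools; do
    if [ ! -x "$dir/$exe" ]; then
      echo "missing: $dir/$exe (run make in $dir)" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It would be called as `check_built /path/to/psiclass && /path/to/psiclass/psiclass --lb bam.list ...`, so the assembly only starts once the executables are in place.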
I'm sorry for forgetting the make step.
Now psiclass is working properly. In my experience the whole process takes 3-4 days using 23 threads, and I will give you feedback then.
I really appreciate your continuous support of the program.
I ran
/public/home/lijing/wangzw/resource/bins/psiclass/psiclass \
--lb bam.list \
-p 24 \
--stranded rf
The gffcompare results for psiclass_vote.gtf look wrong: specificity is low even at the intron level, which is abnormal.
below is stringtie merge
I checked in IGV and found that the software placed assemblies on the opposite strand.
btw, I checked the library strandedness using RSeQC
samtools view -Sbh N1_WTS.bam chr22 > chr22.old.bam
infer_experiment.py \
-r gencode.v24.chr_patch_hapl_scaff.annotation.12.bed \
-i chr22.old.bam
the result is:
# This is PairEnd Data
# Fraction of reads failed to determine: 0.0515
# Fraction of reads explained by "1++,1--,2+-,2-+": 0.0318
# Fraction of reads explained by "1+-,1-+,2++,2--": 0.9167
according to this figure (from RSeQC)
my library is 1+-,1-+,2++,2--, which means my library is fr-firststrand (aka RF)
so I used --rf in stringtie and --stranded rf in psiclass
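The mapping from the infer_experiment.py fractions to a strand option can be sketched as below. The function name `infer_strand` is hypothetical, and the 0.8 cutoff is an assumption chosen for illustration, not an RSeQC or PsiCLASS default.

```shell
# Hypothetical helper: map infer_experiment.py output (read on stdin) to
# "rf", "fr", or "unstranded". The optional first argument is the cutoff;
# its 0.8 default is an assumption, not a value from RSeQC or PsiCLASS.
infer_strand() {
  awk -F': ' -v cutoff="${1:-0.8}" '
    /1\+\+,1--,2\+-,2-\+/ { fr = $2 }   # read 1 on the transcript strand
    /1\+-,1-\+,2\+\+,2--/ { rf = $2 }   # read 2 on the transcript strand
    END {
      if (rf + 0 >= cutoff) print "rf"
      else if (fr + 0 >= cutoff) print "fr"
      else print "unstranded"
    }
  '
}
```

Piping the report in, for example `infer_experiment.py -r annotation.bed -i chr22.old.bam | infer_strand`, gives a value that could be passed to `--stranded`. With the fractions reported above (0.0318 and 0.9167) it prints `rf`.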
Thank you for showing the details. It seems some of the introns are on the right strand while most of them are not. In my test data (even though it is not a stranded library), the strand is the same between PsiCLASS and stringtie. I'll look into this issue by creating a better debugging example. If the chr22.bam file is small, I would appreciate it if you could share the file with me. Thank you!
Ok, I would like to share 3 normal BAMs covering chr22 with you; all are from stranded libraries. Would you please provide an email address?
Yes, you can use the email lsong@ds.dfci.harvard.edu . One bam file would be sufficient. Thank you!
Please check the email for the download link.
Thank you for providing the test examples! I think I've fixed this issue. Could you pull the new branch, recompile PsiCLASS and give it a try? Thank you for your patience and help.
Thank you. I will give you feedback.
The gffcompare result looks plausible this time. Thank you very much.
Thank you! I will merge this branch to master and release a new version.
Dear author,
I would like to compare the assemblies from psiclass and stringtie.
In stringtie, the user can specify --rf or --fr for a strand-specific RNA-seq library instead of outputting the XS tag in the STAR alignment step. So I didn't use --outSAMstrandField intronMotif, and thus my bam files do not have the XS tag.
I wonder whether such bams would affect the psiclass assembly. Or is adding the XS tag to the bam output indispensable regardless of the strandedness of the experiment? Is there any workaround that avoids rerunning the time-consuming STAR alignment steps (I have hundreds of samples)?
Best regards, Wang