Open mpoelchau opened 4 years ago
Is the input file for this command a sorted BAM file, or should I use the index file (.bam.bai) as input?
The sorted and indexed BAM file (the .bam file itself). The regtools documentation doesn't specify how the corresponding .bai file should be named. Hopefully it is name.bam.bai, but some tools expect name.bai.
https://regtools.readthedocs.io/en/latest/commands/junctions-extract/
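Because the thread leaves open which index name regtools looks for, it can help to check which index file actually sits next to the BAM before running anything. The following is a minimal sketch; the helper name `find_bam_index` is hypothetical, and it only checks the two naming conventions mentioned above (name.bam.bai and name.bai). It does not claim to know which one regtools expects.

```python
from pathlib import Path
from typing import Optional


def find_bam_index(bam_path: str) -> Optional[Path]:
    """Return the first existing index file for bam_path, or None.

    Hypothetical helper: checks only the two conventions discussed
    in this thread, name.bam.bai first and then name.bai.
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # name.bam.bai
        bam.with_suffix(".bai"),           # name.bai
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

In either case the file passed to regtools is the .bam file; the index only needs to be present alongside it.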
Right. I found that regtools can't use the .bam.bai file. Should I rename the .bam.bai file to something like "indexed-file.bam" before I run regtools, and rename it back to .bam.bai after regtools finishes?
Can you try
The input file for the regtools command would then be
Do you mean I should rename file.bam.bai to file.bai and use it as input? I tried filename.bai but regtools still couldn't open it. I also tried file.bai.bam and that didn't work either.
I'm having trouble processing .bed files with the flatfile-to-json.pl script - I will update this issue when I figure it out.
flatfile-to-json.pl (on our servers at least) doesn't work as expected after JBrowse 1.16.6. See https://github.com/GMOD/jbrowse/issues/1511 for a full description of the issue, and https://github.com/GMOD/jbrowse/issues/1511#issuecomment-636185863 for a suggested solution.
We can't implement the changes Colin recommends, since installing tabix and bgzip requires htslib, and it doesn't install on CentOS6 (or at least I haven't figured out how to; see also https://github.com/NAL-i5K/remap-gff3/issues/35).
For now we can still use our existing workflow with JBrowse 1.16.5 on our staging and prod sites. Once we move to CentOS8 + Apollo 2.6+, we can revisit this issue.
We've migrated to CentOS8. Reopening this issue because we need to change addtrackList.py to use the setup Colin recommends.
@g8tor I gave this a go but I'll need some of your Python expertise for this...
Let's try to use regtools for this.
regtools junctions-extract function: https://regtools.readthedocs.io/en/latest/commands/junctions-extract/
parameters to specify:
-m 20 (should match the min intron parameter from hisat)
-s 0 (let's assume unstranded for now)
-o [gggsss]_[assemblyname]_downsampled-RNA-Seq-alignments_[date].bed (output file should have the same name prefix as everything else, but with .bed extension)
The output file needs to be moved over to our servers, and added to trackList.json.
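The parameters above could be assembled in Python roughly as follows. This is a sketch under assumptions: the function name `build_junctions_extract_cmd` and the example file names are hypothetical, the subcommand is spelled `junctions extract` as on the linked documentation page, and whether `-s 0` is accepted may depend on the installed regtools version.

```python
import shlex


def build_junctions_extract_cmd(bam_path, prefix, min_intron=20, strand="0"):
    """Build the argument list for a regtools junctions extract run.

    prefix is the shared file-name prefix, e.g.
    [gggsss]_[assemblyname]_downsampled-RNA-Seq-alignments_[date];
    the output BED gets the same prefix with a .bed extension.
    """
    out_bed = f"{prefix}.bed"
    return [
        "regtools", "junctions", "extract",
        "-m", str(min_intron),   # should match the hisat min intron setting
        "-s", str(strand),       # 0 = unstranded (assumed for now)
        "-o", out_bed,
        bam_path,                # the sorted, indexed .bam file
    ]


# Hypothetical file names, for illustration only.
cmd = build_junctions_extract_cmd("sample.bam", "sample_prefix")
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual file names.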
perl flatfile-to-json.pl --bed OUTPUT-BED-FILE --trackLabel '[gggsss]_[assemblyname]_RNA-Seq-alignments_[date]_junctions' --config '{"style":{"showLabels": false}, "metadata": {BED METADATA BELOW}, "category":"RNA-Seq/Intronic splice junctions" }' --className feature3
BED METADATA "Analysis provider": "i5k Workspace@NAL", "Analysis method": "https://github.com/NAL-i5K/NAL_RNA_seq_annotation_pipeline/", "Data source":"[comma-delimited SRA ACCESSIONS from 'Submission' column in .tsv file]", "Publication status":"Analysis: NA; Source data: see individual SRA accessions", "Track legend":"Intronic junction reads generated by Hisat2 aligner and regtools"
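For addtrackList.py, hand-writing the `--config` JSON string is error-prone, so one option is to build it with `json.dumps`. The sketch below uses hypothetical function names (`build_junction_track_config`, `build_flatfile_cmd`) and assumes the SRA accessions have already been read from the 'Submission' column of the .tsv file. The metadata values are copied from the comment above.

```python
import json


def build_junction_track_config(sra_accessions):
    """Return the --config JSON string for the junctions track."""
    metadata = {
        "Analysis provider": "i5k Workspace@NAL",
        "Analysis method": "https://github.com/NAL-i5K/NAL_RNA_seq_annotation_pipeline/",
        # Comma-delimited accessions from the 'Submission' column.
        "Data source": ",".join(sra_accessions),
        "Publication status": "Analysis: NA; Source data: see individual SRA accessions",
        "Track legend": "Intronic junction reads generated by Hisat2 aligner and regtools",
    }
    config = {
        "style": {"showLabels": False},  # serialized as JSON false
        "metadata": metadata,
        "category": "RNA-Seq/Intronic splice junctions",
    }
    return json.dumps(config)


def build_flatfile_cmd(bed_path, track_label, sra_accessions):
    """Build the flatfile-to-json.pl argument list shown above."""
    return [
        "perl", "flatfile-to-json.pl",
        "--bed", bed_path,
        "--trackLabel", track_label,
        "--config", build_junction_track_config(sra_accessions),
        "--className", "feature3",
    ]
```

Passing the resulting list to `subprocess.run` means the JSON does not need the single quotes used on the command line.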