mhalushka / miRge3_build

Enables building small-RNA libraries for organism of choice to use in miRge3.0 pipeline
MIT License

where to get --pre-trna? #4

Closed: IanMcDowell closed this issue 1 year ago

IanMcDowell commented 2 years ago

I want to create a new human reference that includes some custom synthetic miRNAs. It seems like I can't add a new set of miRNAs to an already existing reference, so I have to create a new reference from scratch.

So far I have identified these sources:

- --mature-mir, --hpin-mir, and --ann-gff from miRBase (https://www.mirbase.org/ftp.shtml)
- --snorna, --rrna, --ncrna-other, --mrna from GENCODE (https://www.gencodegenes.org/human/)
- --gen-repeats from UCSC table browser (https://genome.ucsc.edu/cgi-bin/hgTables)
- --mature-trna from GtRNAdb (http://gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi38/Hsapi38-seq.html) or from converting GENCODE gtf to fasta (https://www.gencodegenes.org/human/)

However, it is not clear to me where I can get --pre-trna. Any suggestions? GtRNAdb seems to provide only mature tRNAs, as does the GENCODE gtf. Also, does the rest of the above seem correct?

Thanks for the great set of tools.

arunhpatil commented 2 years ago

Hi @IanMcDowell,

The quickest and easiest way is to append to the existing libraries. We encountered a similar issue in the past, where a user wanted to tweak the spike-ins. You can refer to the following link (FAQ) to tweak your libraries easily; instead of spike-ins, you can use mature miRBase miRNAs, for example. Please let me know if this helps. (Note: Please refer to append in Comment #27)

FAQs

Arun

IanMcDowell commented 1 year ago

Hi Arun,

Pardon me for commenting on a closed issue, but it turns out that this solution will not work for me. I'm interested not only in the quantification of custom synthetic miRNAs, but also their fidelity of processing. Thus, they need to be added to the genome, the gff file, etc., for proper isomiR quantification. May I ask where you got your --pre-trna files (and any other sources you would like to share for miRBase)?

Thanks, Ian

arunhpatil commented 1 year ago

Hi @IanMcDowell,

No worries about commenting on a closed issue. We updated the miRge3.0 release on PyPI very recently, and we are similarly getting it up on Conda with some new features (v.0.1.0 and v.0.1.1). We are testing and fixing any last-minute technical issues, and I will get back to you as soon as possible.

A quick question, how many custom synthetic miRNAs are you planning to include in the existing database/library?

Thank you, Arun.

arunhpatil commented 1 year ago

Hi @IanMcDowell,

The answer to your question is below (we followed the miRge2.0-based method of generating the libraries). Specifically, tRF library generation is described in the paper: Looney MM, Lu Y, Karakousis PC, Halushka MK. (2021). Mycobacterium tuberculosis Infection Drives Mitochondria-Biased Dysregulation of Host Transfer RNA-Derived Fragments. J Infect Dis. 223, 1796-805.

tldr

For precursor tRNA 3’-trailer, we extracted the 100 bp downstream of the 3’-end of tRNA genes where the transcription direction is the same as that of the tRNA gene.
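The trailer extraction described above can be sketched in Python as follows. This is a minimal illustration, not the script used to build the miRge libraries: the function names are hypothetical, and it assumes the chromosome sequence is already loaded as a string and that the tRNA gene coordinates are 0-based and half-open (BED style). For a gene on the minus strand, "downstream" lies at lower genomic coordinates, so the slice is taken before the gene start and reverse-complemented.

```python
# Hypothetical helper names; coordinates are assumed 0-based, half-open (BED style).

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_trailer(chrom_seq, start, end, strand, length=100):
    """Return up to `length` bases downstream of the 3' end of a tRNA gene,
    read in the same direction as the gene is transcribed."""
    if strand == "+":
        # The 3' end is at `end`; downstream means higher coordinates.
        return chrom_seq[end:end + length]
    if strand == "-":
        # The 3' end is at `start`; downstream means lower coordinates.
        lo = max(0, start - length)
        return reverse_complement(chrom_seq[lo:start])
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")
```

Near a chromosome end the returned trailer is simply shorter than `length`; whether such truncated trailers should be kept is a choice the text above does not specify.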

Please let me know if you need help building a custom library.

Thank you, Arun.

IanMcDowell commented 1 year ago

...finally returning to this work. Planning on adding on the order of ~20 custom synthetic miRNAs.

arunhpatil commented 1 year ago

@IanMcDowell,

That is great. Happy new year. I am customizing another script to extend libraries to other species with ease, and I hope to have it ready as soon as possible. Regarding the custom synthetic miRNAs, you can either treat them as spike-ins or append them to the end of the miRBase or MirGeneDB miRNA database and re-index it using bowtie-build. Let me know if you need any further information regarding this.
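A rough Python sketch of the append-and-re-index route, for illustration only. The function names, file paths, and index prefix are hypothetical, and the actual file names inside a miRge3.0 library may differ. The sketch assumes the library FASTA is stored as DNA, so it converts U to T in the custom sequences; adjust that if your library differs. The only external call is `bowtie-build <reference.fa> <index_prefix>`, which must be on your PATH.

```python
import subprocess
from pathlib import Path


def read_fasta_names(path):
    """Collect the first word of every FASTA header in `path`."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def append_custom_mirnas(library_fa, custom, combined_fa):
    """Write `library_fa` plus the custom records to `combined_fa`.

    `custom` maps a record name to its sequence. Names that already exist
    in the library are rejected so no header is duplicated.
    """
    clash = read_fasta_names(library_fa) & set(custom)
    if clash:
        raise ValueError(f"names already in library: {sorted(clash)}")
    text = Path(library_fa).read_text()
    if text and not text.endswith("\n"):
        text += "\n"  # make sure the first new header starts on its own line
    with open(combined_fa, "w") as out:
        out.write(text)
        for name, seq in custom.items():
            # Assumption: the library is stored as DNA, so U becomes T.
            out.write(f">{name}\n{seq.upper().replace('U', 'T')}\n")


def rebuild_index(combined_fa, index_prefix):
    """Re-index the combined FASTA with bowtie-build."""
    subprocess.run(
        ["bowtie-build", str(combined_fa), str(index_prefix)], check=True
    )
```

Writing to a new combined file rather than editing the library in place keeps the original reference intact in case the new index needs to be rebuilt or compared.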

Thank you, Arun.

IanMcDowell commented 1 year ago

Here's what worked for me in adding custom synthetic miRNAs to your precomputed miRBase reference:

I am not interested in identifying novel miRNA, so I suspect that some of the files above may be superfluous.

Thanks again, Ian