velocyto-team / velocyto.py

RNA velocity estimation in Python
http://velocyto.org/velocyto.py/
BSD 2-Clause "Simplified" License

How do I generate the repeat-masked GTF file from Ensembl? #385

Open erzakiev opened 10 months ago

erzakiev commented 10 months ago

I want to use Ensembl annotations because my CellRanger output was aligned using GRCh38 v 32 (Ensembl 98).

For that matter, I have the fasta sequence of the corresponding release, Homo_sapiens.GRCh38.dna_rm.toplevel.fa.gz, in which the repeat sequences were hard-masked by the Ensembl consortium themselves.

But to use it as input for velocyto, I need to convert it into a GTF file. How should I do this?
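
One possible approach, sketched below under several assumptions, is to scan the hard-masked FASTA for runs of `N` and write each run as a GTF interval. The function names `masked_intervals` and `write_mask_gtf`, the `hardmask` source label, the `masked_<n>` IDs, and the output filename are all hypothetical. The sketch also assumes that velocyto only needs nine tab-separated GTF columns with chromosome, start, end, and strand in the mask file, which should be checked against the velocyto source. Two caveats: runs of `N` in a `dna_rm` file also include assembly gaps, not only repeats, and Ensembl chromosome names (`1`, `2`, ...) may not match the names in the CellRanger BAM (for example `chr1`), so the names may need to be remapped.

```python
import gzip
import re
from typing import Iterator, Sequence, TextIO, Tuple

N_RUN = re.compile(r"[Nn]+")


def masked_intervals(fasta: TextIO) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) for every run of N, 1-based and inclusive."""
    chrom = None
    offset = 0        # bases of the current sequence read so far
    run_start = None  # start of an N run still open at the end of the previous line
    for line in fasta:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if run_start is not None:
                yield chrom, run_start, offset
            chrom = line[1:].split()[0]  # sequence name is the first word of the header
            offset = 0
            run_start = None
            continue
        # An open run ends here if this line does not start with N.
        if run_start is not None and line[0] not in "Nn":
            yield chrom, run_start, offset
            run_start = None
        for m in N_RUN.finditer(line):
            start = offset + m.start() + 1
            if m.start() == 0 and run_start is not None:
                start = run_start  # run continues from the previous line
            if m.end() == len(line):
                run_start = start  # run may continue on the next line
            else:
                yield chrom, start, offset + m.end()
                run_start = None
        offset += len(line)
    if run_start is not None:
        yield chrom, run_start, offset


def write_mask_gtf(fasta: TextIO, out: TextIO,
                   strands: Sequence[str] = ("+", "-")) -> int:
    """Write one GTF line per masked run and strand; return the number of runs.

    Hard-masking carries no strand information, so by default each run is
    written once per strand (an assumption about what the consumer expects).
    """
    n = 0
    for chrom, start, end in masked_intervals(fasta):
        attrs = f'gene_id "masked_{n}"; transcript_id "masked_{n}";'
        for strand in strands:
            fields = [chrom, "hardmask", "exon", str(start), str(end),
                      ".", strand, ".", attrs]
            out.write("\t".join(fields) + "\n")
        n += 1
    return n


if __name__ == "__main__":
    # Output filename is a placeholder.
    with gzip.open("Homo_sapiens.GRCh38.dna_rm.toplevel.fa.gz", "rt") as fa, \
            open("GRCh38_rm_mask.gtf", "w") as out:
        write_mask_gtf(fa, out)
```

The resulting file could then be passed to velocyto through its mask option (`-m`). An alternative that avoids the assembly-gap caveat is to export the RepeatMasker track for the same assembly as GTF from the UCSC Table Browser, again making sure the chromosome names match the BAM.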