Transipedia / dekupl-run

Identify differentially expressed k-mers between RNA-Seq datasets
MIT License

modify Snakefile and add option in conf file to run dekupl on ensembl… #30

Open sebriq opened 6 years ago

sebriq commented 6 years ago

Hello, some time ago I changed the code of the dekupl-run Snakefile and of dekupl-annotation so that they work with the official Ensembl GFF/GTF files, which makes it possible to run de-kupl on non-human models. My initial tests were based on the zebrafish annotation, for example. For this version I tried to keep the modifications to a minimum and to touch only dekupl-run. In brief, the gene names and other information are stored differently in the Ensembl FASTA, and I changed the extraction accordingly. The config file now contains an "annotation_type" option that can be set to "gencode" or "ensembl". Finally, the Ensembl GFF file does not contain the ".X" part of ENSGXXXXXXXX.X, so I remove it when the gene reference is extracted from the FASTA. Otherwise the dekupl-annotation getSwitches.R code outputs nothing. DEKUPL-ANNOTATION IS NOT FULLY TESTED WITH THIS VERSION. I am afraid I may have to remove the .X part from the transcript references too, although I am not sure the transcript reference is used in the GFF handling of dekupl-annotation.
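To illustrate the idea, here is a minimal Python sketch of the header handling described above. It is not the code from the Snakefile: the function names `parse_header` and `strip_version` are hypothetical, and the header layouts are assumptions based on the usual GENCODE transcript FASTA (pipe-separated fields) and Ensembl cDNA FASTA (space-separated `key:value` tokens with a `gene:` entry).

```python
import re

# Trailing ".<digits>" suffix, e.g. the ".5" in ENSG00000223972.5
VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(identifier):
    """Remove the trailing '.X' part from an ID, if present."""
    return VERSION_SUFFIX.sub("", identifier)


def parse_header(header, annotation_type):
    """Return (transcript_id, gene_id) from a transcript FASTA header.

    Hypothetical helper; the header layouts are assumptions:
      gencode: >TRANSCRIPT_ID|GENE_ID|...
      ensembl: >TRANSCRIPT_ID cdna ... gene:GENE_ID ...
    """
    header = header.lstrip(">").strip()
    if annotation_type == "gencode":
        fields = header.split("|")
        return fields[0], fields[1]
    if annotation_type == "ensembl":
        tokens = header.split()
        gene_id = next(
            (t.split(":", 1)[1] for t in tokens[1:] if t.startswith("gene:")),
            None,
        )
        if gene_id is None:
            raise ValueError("no 'gene:' field in header: " + header)
        # The Ensembl GFF has no '.X' on gene IDs, so drop it here as well.
        return tokens[0], strip_version(gene_id)
    raise ValueError("annotation_type must be 'gencode' or 'ensembl'")
```

In this sketch the transcript ID keeps its ".X" part in both modes, which matches the current state of the change; if dekupl-annotation turns out to need it removed, `strip_version` could be applied to the transcript ID too.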

PS: a space is missing in the annotation download command, so do not accept this modification.

jaudoux commented 6 years ago

Hi @sebriq, I had not seen the pull request. It is a good idea, thanks. However, could you update the README with information about the new option before I merge the two branches?