genomic-medicine-sweden / tomte

A Nextflow pipeline for analysing expression and splicing in RNA-seq data from rare disease patients
MIT License

Add custom genome config file instead of igenomes #64

Closed A97paupic closed 8 months ago

A97paupic commented 11 months ago

Description of feature

Hi, is it feasible for you to accommodate a configuration in which you can choose whether to use the iGenomes files or your own locally produced files?

Kind Regards, Paul

jemten commented 11 months ago

Might be possible. We have actually not tried it using iGenomes; we have always used our local references. The FASTA genome and GTF should be in iGenomes, and even the STAR index. I don't know if the VEP cache is there, but you could download it or use an image with the cache included. The background files for DROP (OUTRIDER and FRASER) are best built from local data, but you can use the files provided by the DROP people to get you started: https://www.cmm.in.tum.de/public/paper/drop_analysis/resource/

ramsainanduri commented 11 months ago

Hi, I am a colleague of @A97paupic. Here is some more information on the issue.

We understand your point. Our concern is avoiding repeated creation of the reference files. Instead, we want to supply the various reference files directly, either on the command line or in the config file, for example with the --star_index parameter.

However, we encountered a minor bug in the workflows/tomte.nf and subworkflows/local/preparereferences.nf scripts. When providing the star index as a parameter, an error occurred during the alignment step. The issue stems from the input channel not satisfying the expected cardinality (it should be [meta, index], but it was [index] only). Similar challenges were observed with files like the dict file. To address this, we've locally fixed the issue and ensured that it now functions as intended.
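
A minimal sketch of the kind of change this describes, not the actual patch: it wraps the supplied index path in a [meta, index] tuple before it reaches the alignment step. Only params.star_index comes from this thread; the channel name and the 'id' value in the meta map are assumptions for illustration.

```nextflow
// Sketch only: ch_star_index and the 'id' value are illustrative assumptions.
workflow {
    // A bare path would yield a channel of [index];
    // the alignment step expects [meta, index], so add a meta map in front.
    ch_star_index = params.star_index
        ? Channel.value([ [id: 'star_index'], file(params.star_index, checkIfExists: true) ])
        : Channel.empty()

    // Print the emitted tuple so the shape can be checked by eye.
    ch_star_index.view()
}
```

The same wrapping would apply to other user-supplied references that hit the cardinality error, such as the dict file.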

jemten commented 11 months ago

Interesting, I think you should try the dev branch. We are working through some of the issues you mentioned in https://github.com/genomic-medicine-sweden/tomte/pull/58. And please make a PR to dev if you have fixed some of these obvious bugs 😄

jemten commented 11 months ago

OK, having read your comment a bit more carefully @ramsainanduri and looking at @A97paupic's commit, I think I understand a bit better what you want. It would indeed be neat to repurpose or introduce a new variable for iGenomes. I think you run it quite similarly to how we run the pipeline, in that we use a local parameter file with paths to all our references, using the option -params-file <local_parms>.yaml:

---
genome: 'GRCh37'
fasta: '/home/proj/stage/rare-disease/references/tomte_references_1.0/grch37_homo_sapiens_-gencode_pri-.fasta'
fai: '/home/proj/stage/rare-disease/references/tomte_references_1.0/grch37_homo_sapiens_-gencode_pri-.fasta.fai'
gtf: '/home/proj/stage/rare-disease/references/tomte_references_1.0/grch37_gencode_annotation_-v37-.gtf'
star_index: '/home/proj/stage/rare-disease/references/tomte_references_1.0/star'
salmon_index: '/home/proj/stage/rare-disease/references/tomte_references_1.0/salmon'

# Other parameters
subsample_bed: '/home/proj/stage/rare-disease/references/tomte_references_1.0/grch37_homo_sapiens_hemoglobin_noncanonical.bed'

# VEP
vep_cache: '/home/proj/stage/rare-disease/references/tomte_references_1.0/ensembl-tools-release-107/cache'
vep_cache_version: 107

# DROP
reference_drop_count_file: '/home/proj/stage/rare-disease/references/tomte_references_1.0/exported_counts/geneCounts.tsv.gz'
reference_drop_splice_folder: '/home/proj/stage/rare-disease/references/tomte_references_1.0/exported_counts'
reference_drop_annot_file: '/home/proj/stage/rare-disease/references/tomte_references_1.0/sampleAnnotation2.tsv'
gene_panel_clinical_filter: '/home/proj/stage/rare-disease/references/tomte_references_1.0/grch37_gene_panels.bed'

priority: 'development'
clusterOptions: '--qos=low'

Looks like you are using the igenomes_base parameter and the igenomes config to do the same thing.
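
For comparison, an iGenomes-style config for local files might look roughly like the sketch below. It follows the general nf-core layout of a genomes map under params. The file name, the placeholder file names under igenomes_base, and which keys the pipeline actually reads are all assumptions here, and igenomes_base is assumed to be set elsewhere (for example on the command line).

```nextflow
// custom_genomes.config (hypothetical file name), loaded with: -c custom_genomes.config
// All file names below are placeholders; igenomes_base is assumed to be defined elsewhere.
params {
    genomes {
        'GRCh37' {
            fasta = "${params.igenomes_base}/genome.fasta"
            gtf   = "${params.igenomes_base}/annotation.gtf"
            star  = "${params.igenomes_base}/star"
        }
    }
}
```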

Did I understand you correctly?

ramsainanduri commented 11 months ago

Yes, something like that is what we are doing.