AndersenLab / annotation-nf

Annotate VCF with snpeff and bcsq

Species Specific Key Files replaced by GFF #6
Assignees: @mckeowr1
Labels: bug

Closed mckeowr1 closed 1 year ago

mckeowr1 commented 1 year ago

The changes to make_flat_file in commit e36f6ce result in incorrect gene names being generated for CB. We need to use species-specific name keys to correctly add the "common_name" for CB. At the moment, the file passed as the name key is the GFF:

if(params.snpeff){
    bcsq_parse_samples.out
        .combine(bcsq_parse_scores.out)
        .combine(Channel.fromPath("${reference_dir}/csq/${params.species}.gff"))
        .combine(snpeff_annotate_vcf.out.snpeff_flat) | make_flat_file_snpeff
//.combine(Channel.fromPath("${workflow.projectDir}/bin/wormbase_name_key.txt")) | make_flat_file

After adding species-specific key files, we need to update the code that reads in and names the key file in make_flat_file.R and make_flat_file_snpeff.R.
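
A rough sketch of what the workflow change could look like, assuming each species gets its own key file. The file location and the `<species>_name_key.txt` naming scheme are hypothetical (modeled on the commented-out wormbase_name_key.txt line), not paths that exist in the repo yet:

```groovy
// Sketch only: the key-file path and naming scheme are assumptions.
// checkIfExists makes the run fail early if a species has no key file.
name_key = Channel.fromPath(
    "${workflow.projectDir}/bin/${params.species}_name_key.txt",
    checkIfExists: true
)

bcsq_parse_samples.out
    .combine(bcsq_parse_scores.out)
    .combine(name_key) // species-specific name key instead of the GFF
    .combine(snpeff_annotate_vcf.out.snpeff_flat) | make_flat_file_snpeff
```

The same channel could be passed to make_flat_file in the non-snpeff branch, so both R scripts receive the key in the wbgene_names slot.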

mckeowr1 commented 1 year ago

We need to update the species specific inputs to make_flat_file.

The inputs are:

tuple file(bcsq_samples_parsed), file(div_file), file(bcsq_scores), file(wbgene_names), file(snpeff)

We define these inputs in the workflow as:

    bcsq_parse_samples.out
        .combine(bcsq_parse_scores.out)
        .combine(Channel.fromPath("${reference_dir}/csq/${params.species}.gff"))
        .combine(snpeff_annotate_vcf.out.snpeff_flat) | make_flat_file

file(bcsq_samples_parsed) and file(div_file) are produced by bcsq_parse_samples. The scores come from bcsq_parse_scores. Instead of the $wbgene_names key, we are currently passing the GFF file. Updates:

mckeowr1 commented 1 year ago

The elegans name key is unique in format, and we should generate similar ones for CB and CT.