nf-core / hic

Analysis of Chromosome Conformation Capture data (Hi-C)
https://nf-co.re/hic
MIT License

DSL2 implementation of nf-core-hic pipeline #91

Closed: nservant closed this issue 1 year ago

nservant commented 3 years ago

Here is a place where we could discuss the different modules for the DSL2 implementation. I'm putting here the list of modules I have in mind.

Quality Controls

Hi-C data processing

From fastq to a list of interactions = sub-workflows

Format conversion

Downstream analysis

All from cool files if possible

ewels commented 3 years ago

A couple of these tools are ready to go already 🎉

Once you have your empty(ish) DSL2 pipeline in place, you should be able to install them with, for example, nf-core modules install fastqc.

Two nf-core/bytesize talks coming up which will be relevant:

Phil

nservant commented 3 years ago

Thanks @ewels. The other thing I was thinking about is that the Hi-C data processing per se (i.e. hicpro, hicup, bwa + pair filtering) should rely more on sub-workflows than on modules.

koustav-pal commented 3 years ago

Hi @nservant, can I start working on the BAM pairing part of the DSL2 implementation?

nservant commented 3 years ago

Hi @koustav-pal, yes, if you want you can start looking at all the processes related to HiC-Pro, i.e.:

bowtie2_end_to_end, trim_reads, bowtie2_on_trimmed_reads, bowtie2_merge_mapping_steps, dnase_mapping_stats, combine_mates, get_valid_interaction, get_valid_interaction_dnase, remove_duplicates, merge_stats, convert_to_pairs

I have not had time to really look at DSL2 so far, but as we discussed before, I think the idea would be to have all these processes in a sub-workflow. Also, I guess that the bowtie2 modules already available should work for HiC-Pro as well.

I guess we can work on a new DSL2 branch in the meantime. And please, try to start from the version which is not yet merged, or from my own fork if it's easier, because I changed some options to make the pipeline easier to use. Many thanks!
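As a rough illustration of the sub-workflow idea, a minimal DSL2 sketch might look like the following. It is only a sketch under stated assumptions: the process bodies are placeholders that create empty files, only three of the processes listed above are shown, and the workflow name HICPRO, the channel shape [ sample, file ] and the output file names are hypothetical rather than the pipeline's actual code.

```nextflow
nextflow.enable.dsl = 2

// Placeholder for the bowtie2_end_to_end step (assumed input: [ sample, fastq ]).
process BOWTIE2_END_TO_END {
    input:
    tuple val(sample), path(reads)

    output:
    tuple val(sample), path("${sample}.bam")

    script:
    """
    touch ${sample}.bam
    """
}

// Placeholder for the combine_mates step.
process COMBINE_MATES {
    input:
    tuple val(sample), path(bam)

    output:
    tuple val(sample), path("${sample}.pairs.bam")

    script:
    """
    touch ${sample}.pairs.bam
    """
}

// Placeholder for the get_valid_interaction step.
process GET_VALID_INTERACTION {
    input:
    tuple val(sample), path(bam)

    output:
    tuple val(sample), path("${sample}.validPairs")

    script:
    """
    touch ${sample}.validPairs
    """
}

// Hypothetical sub-workflow chaining the steps, from reads to valid interactions.
workflow HICPRO {
    take:
    reads   // channel of [ sample, fastq ]

    main:
    BOWTIE2_END_TO_END(reads)
    COMBINE_MATES(BOWTIE2_END_TO_END.out)
    GET_VALID_INTERACTION(COMBINE_MATES.out)

    emit:
    valid_pairs = GET_VALID_INTERACTION.out
}
```

The main workflow would then call HICPRO(reads) and pass HICPRO.out.valid_pairs on to the format conversion and downstream steps, which keeps the HiC-Pro logic swappable with another processing sub-workflow.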

koustav-pal commented 3 years ago

Thanks @nservant,

I will start looking into these functions. I think I will use the RNA-seq DSL2 pipeline as a reference. Do you prefer that I work on a separate fork, or can I be added to the repo? In either case, I will create a new branch containing an empty DSL2 template that I can start populating.

nservant commented 3 years ago

I think the best practice would be that you work on your own fork, and then open a PR against this one. Thanks

koustav-pal commented 3 years ago

Hi @nservant,

Apologies for the long silence on my end regarding this. I had to deal with an unexpected cancer-related family obligation over the past two months. Things have settled down for now, and I think I can start looking into this with better clarity.

Thanks.

nservant commented 3 years ago

Hi @koustav-pal, sorry to hear that! Don't worry at all, there is absolutely no urgency! I'll release a new version of the pipeline in DSL1 in a couple of days. Then, we can move on to DSL2. Best

nservant commented 3 years ago

Hi @koustav-pal, I was wondering whether you have already made progress on the DSL2 implementation? Otherwise, I will try to have a look in the coming months. Thanks, N

nservant commented 2 years ago

Hi @koustav-pal, last call :) Any news from your side? Otherwise, I'll try to move forward with it. Best