valegale / ONT_methylation

Restructure #6

Open hoelzer opened 2 days ago

hoelzer commented 2 days ago

Hey @valegale I restructured your pipeline a bit to give you some guidance regarding:

I did not have perfect test data, so please expect some bugs.

I tried my code using a random FASTA reference and a sorted minimap2 BAM I had around:

nextflow run main.nf --fasta test-reference.fasta --bam test-reference.bam

If you have multiple references and BAMs you would do:

nextflow run main.nf --fasta '*.fasta' --bam '*.bam'

and then, of course, their .baseNames have to match so that the channel is formed correctly.
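
Since a FASTA without a BAM of the same base name would not find a partner, it can help to check the inputs before launching the pipeline. The following is only a minimal sketch: `unmatched_fastas` is a hypothetical helper that is not part of this PR, and it assumes that all files sit in one directory and use the `.fasta` and `.bam` extensions.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the pipeline):
# print the base name of every FASTA in a directory that has
# no BAM with the same base name next to it.
unmatched_fastas() {
    local dir="$1"
    local fasta base
    for fasta in "$dir"/*.fasta; do
        [ -e "$fasta" ] || continue            # glob matched nothing
        base="$(basename "$fasta" .fasta)"     # strip directory and extension
        [ -e "$dir/$base.bam" ] || printf '%s\n' "$base"
    done
}
```

Running `unmatched_fastas .` in the input directory prints nothing when every reference has a matching mapping.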

I also added a --list parameter, which allows you to provide CSV files as input. This might be handy because then you can do something like

nextflow run main.nf --list --fasta references.csv --bam mappings.csv

where the content of these CSVs is something like

sample1,/path/to/reference1.fasta
sample2,/path/to/reference2.fasta
sample3,/path/to/another/reference3.fasta

sample1,/path/to/mapping1.bam
sample2,/path/to/mapping2.bam
sample3,/path/to/another/mapping3.bam
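
With CSV input, the sample IDs in the first column presumably play the role that the base names play for glob input, so both files should list the same IDs. A small sketch for checking this is shown here; `csv_id_mismatches` is a hypothetical helper, not something the pipeline provides, and it assumes plain comma-separated files without a header line.

```shell
# Hypothetical helper (not part of the pipeline):
# print sample IDs that appear in only one of the two CSV files.
csv_id_mismatches() {
    refs_csv="$1"
    maps_csv="$2"
    ids_a="$(mktemp)"
    ids_b="$(mktemp)"
    # The first comma-separated column holds the sample ID.
    cut -d, -f1 "$refs_csv" | sort -u > "$ids_a"
    cut -d, -f1 "$maps_csv" | sort -u > "$ids_b"
    # comm -3 drops IDs common to both files; tr removes the column indent.
    comm -3 "$ids_a" "$ids_b" | tr -d '\t'
    rm -f "$ids_a" "$ids_b"
}
```

For example, `csv_id_mismatches references.csv mappings.csv` prints nothing if the two files describe the same set of samples.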

I hope this restructuring also gives you some guidance on how to easily add more processes, parameters, and input options if necessary.

Also, please feel free to rename parameters/channels if you think there are better alternatives.

If you are fine with the changes, please feel free to approve and merge this PR.

hoelzer commented 2 days ago

Ah, I also removed your extra gzip FASTQ process because I did not see why it is needed. Instead I do

    samtools fastq ${bam_file} -T MM,ML | gzip -c - > ${sample_id}/${sample_id}.fastq.gz

and done.

EDIT: Hm, I'm not sure whether directly piping the samtools output into gzip is super slow... o_O

EDIT2: Okay, maybe it was only slow because I ran too much in parallel on my MacBook.

The test is currently running with my random example FASTA/BAM.

Please also test this PR before merging. E.g., you can test via
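
If the compression step does turn out to be the bottleneck, one option is a lower gzip level, which trades a larger file for less CPU time. This is only a sketch and not something the PR does: `compress_stream` is a hypothetical wrapper, and whether it helps for real ONT data would need to be measured.

```shell
# Hypothetical wrapper (not part of the pipeline):
# compress stdin to stdout with a selectable gzip level.
compress_stream() {
    level="${1:-6}"      # 6 is gzip's default; 1 is fastest, 9 is smallest
    gzip -c "-$level"
}
```

In the process script it would take the place of the plain `gzip -c -` call, e.g. by piping the `samtools fastq` output into `compress_stream 1`.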

nextflow pull valegale/ONT_methylation
nextflow run valegale/ONT_methylation -r restructure ...