nf-core / smrnaseq

A small-RNA sequencing analysis pipeline
https://nf-co.re/smrnaseq
MIT License

Fix Nextflex 4nt 3p trimming #407

Closed lpantano closed 2 months ago

lpantano commented 2 months ago

Description of the bug

Right now, the 3p trimming happens before adapter removal, which is not correct. Adapters need to be removed first, followed by the 3p trimming. For now, anyone who needs to run the Nextflex protocol can use the trim3p_nextflex branch.

To avoid local modules, the proposed change is described in this discussion from PR 386: https://github.com/nf-core/smrnaseq/pull/386#discussion_r1733450888

    include { FASTP as FASTP3 } from '../modules/nf-core/fastp'

    FASTP3(
        ch_reads_for_mirna,
        [],
        false,
        false,
        false
    )

and set stageInMode to 'copy' for this process in the modules.config.
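
A minimal sketch of what that setting could look like, assuming the aliased process is selected by the name FASTP3 (the selector is an assumption based on the include above, not something confirmed in the PR):

```groovy
// modules.config (sketch only; the 'FASTP3' selector is assumed from the alias above)
process {
    withName: 'FASTP3' {
        // Copy input files into the task work directory instead of symlinking them
        stageInMode = 'copy'
    }
}
```

stageInMode is a standard Nextflow process directive, so scoping it with withName keeps the copy behaviour limited to this one step rather than the whole pipeline.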

This needs to be tested.

Command used and terminal output

No response

Relevant files

No response

System information

No response

nschcolnicov commented 2 months ago

@lpantano I'm looking into this, thanks for creating the issue.

nschcolnicov commented 2 months ago

My original thought was that the nf-core/fastp module tries to create a symlink to the input file during module execution and, since the .fastq file is already a symlink, ends up with a symlink to an empty file. If that were the case, setting stageInMode to 'copy' should fix it, but it doesn't. I copied the work directory and manually copied the file into it, and running the symlink command still produced an empty file. So the issue is not how the file is staged; at that point it looked like a problem with the symlink command itself.

Upon further testing, I saw that the symlink works without any issues and creates a file that should be good to use. The actual issue is that fastp overwrites its input file: the output file name is generated by adding .fastp to the input file name, which here gives it the same name as the original input file that the symlink refers to. I updated the prefix so that the output file gets a name different from the input file. I'll update the PR.
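
For illustration, a prefix override of this kind might look roughly like the sketch below. It assumes the module follows the usual nf-core convention of reading task.ext.prefix and falling back to meta.id; the FASTP3 selector and the .trim3p suffix are hypothetical and may differ from what the PR ends up using.

```groovy
// modules.config (sketch only; selector and suffix are hypothetical)
process {
    withName: 'FASTP3' {
        // Give the output a prefix that cannot collide with the staged input file name
        ext.prefix = { "${meta.id}.trim3p" }
    }
}
```

The closure is evaluated per task, so each sample gets its own prefix derived from its meta.id.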

lpantano commented 2 months ago

Thank you, it is tricky. Is the output channel getting only the one FASTQ file we want? I am asking because of this line:

    tuple val(meta), path('*.fastp.fastq.gz') , optional:true, emit: reads

The prefix is not taken into consideration there, so I am not sure whether more than one file will go into the channel.