nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Nextflow creates an object ending with a slash when using publishDir #5074

Open adriannavarrobetrian opened 1 week ago

adriannavarrobetrian commented 1 week ago

Bug report

When using the publishDir directive to send outputs to an S3 location, Nextflow appears to create a zero-byte object whose key ends in a slash at the publishDir location. While S3 technically allows this, it causes problems for operations on the resulting publishDir location, such as recursing over objects or counting the objects under a prefix. It also keeps empty prefixes around; the objects themselves cannot be seen in the console, and a naive attempt to copy the prefix with the AWS CLI fails.

Steps to reproduce the problem

Minimal example to replicate:

process example {
    publishDir "s3://my-bucket/buy-why-nextflow/test"

    input:
    val sample

    output:
    path "*fastq.gz"

    script:
    """
    touch ${sample}.fastq.gz
    """
}

workflow {
    example(Channel.of("SAMP1", "SAMP2"))
}

$ aws s3 ls --recursive s3://my-bucket/test/
2024-06-13 15:15:04          0 test/
2024-06-13 15:15:04          0 test/SAMP1.fastq.gz
2024-06-13 15:15:03          0 test/SAMP2.fastq.gz
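
To illustrate why the extra `test/` key matters when counting or iterating over objects, here is a minimal Python sketch. It is not part of the original report, and the helper names `parse_listing` and `is_directory_marker` are hypothetical. It assumes the four-column `aws s3 ls --recursive` output shown above (date, time, size, key) and treats any zero-byte key ending in a slash as a folder placeholder.

```python
from typing import List, NamedTuple


class S3Entry(NamedTuple):
    """One row of `aws s3 ls --recursive` output (size and key only)."""
    size: int
    key: str


def parse_listing(text: str) -> List[S3Entry]:
    """Parse listing rows of the form: date, time, size, key."""
    entries = []
    for line in text.splitlines():
        # Split at most 3 times so that keys containing spaces stay intact.
        parts = line.split(None, 3)
        if len(parts) == 4:
            entries.append(S3Entry(size=int(parts[2]), key=parts[3]))
    return entries


def is_directory_marker(entry: S3Entry) -> bool:
    """Assumption: a zero-byte object whose key ends in '/' is a folder placeholder."""
    return entry.size == 0 and entry.key.endswith("/")


# The listing copied from the report above.
listing = """\
2024-06-13 15:15:04          0 test/
2024-06-13 15:15:04          0 test/SAMP1.fastq.gz
2024-06-13 15:15:03          0 test/SAMP2.fastq.gz
"""

entries = parse_listing(listing)
markers = [e.key for e in entries if is_directory_marker(e)]
real_objects = [e.key for e in entries if not is_directory_marker(e)]

print(markers)       # ['test/']
print(real_objects)  # ['test/SAMP1.fastq.gz', 'test/SAMP2.fastq.gz']
```

A naive count of the rows gives 3 objects under the prefix, while only 2 are published files; any downstream tool that iterates over the prefix has to apply a filter like this one.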

Program output

$ aws s3 cp s3://my-bucket/test/ ./
download failed: s3://my-bucket/test/ to ./ [Errno 21] Is a directory: '/some_local_dir/.5beAaC30' -> '/some_local_dir/'

Environment

pditommaso commented 1 week ago

I don't understand what is supposed to be the prefix in your example.

I've used this process definition:

process example {
    publishDir "s3://nextflow-ci/buy-why-nextflow"

    input:
    val sample

    output:
    path "*fastq.gz"

    script:
    """
    touch ${sample}.fastq.gz
    """
}

I'm getting this result, which looks perfectly fine:

2024-06-18 18:16:52          0 
2024-06-18 18:16:52          0 SAMP1.fastq.gz
2024-06-18 18:16:52          0 SAMP2.fastq.gz

adriannavarrobetrian commented 1 week ago

Sorry, I copied the example incorrectly. It's a folder; I updated it to test.

pditommaso commented 1 week ago

It's essentially the same; I don't see why it should not work.