isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

Racon_wrapper got an error about file format. #201

Closed ghost closed 2 years ago

ghost commented 2 years ago

Hi, I tried to run error correction on my PacBio data using racon_wrapper with the following command:

racon_wrapper --split 500000000 -c 100 -b --cudaaligner-batches 100 \
-t 48 sample.fasta overlaps.paf sample.fasta > corrected.fasta

However, I got the following error:

[racon::createPolisher] error: file  has unsupported format extension (valid extensions: .fasta, .fasta.gz, .fna, .fna.gz, .fa, .fa.gz, .fastq, .fastq.gz, .fq, .fq.gz)!

It seems that the order in which racon_wrapper adds the parameters is wrong. Specifically, I think there is a problem in lines 120-137 of racon_wrapper.

        racon_params = [RaconWrapper.__racon]
        if (self.include_unpolished == True): racon_params.append('-u')
        if (self.fragment_correction == True): racon_params.append('-f')
        racon_params.extend(['-w', str(self.window_length),
            '-q', str(self.quality_threshold),
            '-e', str(self.error_threshold),
            '-m', str(self.match),
            '-x', str(self.mismatch),
            '-g', str(self.gap),
            '-t', str(self.threads),
            self.subsampled_sequences, self.overlaps, ""]) 
        if (True):
            if (self.cuda_banded_alignment == True): racon_params.append('-b')
            racon_params.extend([
                '--cudaaligner-band-width', str(self.cudaaligner_band_width),
                '--cudaaligner-batches', str(self.cudaaligner_batches),
                '-c', str(self.cudapoa_batches)])
        print(racon_params)

I modified this block as follows, and it worked fine. (I'm still checking...)

        racon_params = [RaconWrapper.__racon]
        if (self.include_unpolished == True): racon_params.append('-u')
        if (self.fragment_correction == True): racon_params.append('-f')
        racon_params.extend(['-w', str(self.window_length),
            '-q', str(self.quality_threshold),
            '-e', str(self.error_threshold),
            '-m', str(self.match),
            '-x', str(self.mismatch),
            '-g', str(self.gap),
            '-t', str(self.threads)
            ])
        if (True):
            if (self.cuda_banded_alignment == True): racon_params.append('-b')
            racon_params.extend([
                '--cudaaligner-band-width', str(self.cudaaligner_band_width),
                '--cudaaligner-batches', str(self.cudaaligner_batches),
                '-c', str(self.cudapoa_batches)])
        racon_params.extend([self.subsampled_sequences, self.overlaps, ""])
        print(racon_params)
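The effect of the ordering can be reproduced in isolation. The sketch below is not the wrapper's code: it assumes that the wrapper later fills in the target file by overwriting the last element of the list, and the helper names (`build_params`, `set_target`) and the file name `chunk_0.fasta` are made up for illustration. Under that assumption, the original order leaves the empty placeholder in the middle of the command line, which would match the empty file name in the error message.

```python
# Minimal sketch, NOT the actual racon_wrapper code.
# Hypothetical helpers; option values are taken from the command in this report.

def build_params(positional_last):
    options = ['-t', '48']
    cuda = ['-b', '--cudaaligner-batches', '100', '-c', '100']
    # '' is the placeholder that should later become the target file.
    positional = ['sample.fasta', 'overlaps.paf', '']
    if positional_last:
        return ['racon'] + options + cuda + positional
    return ['racon'] + options + positional + cuda

def set_target(params, target):
    # Assumption: the target is written into the last list element.
    params = list(params)
    params[-1] = target
    return params

buggy = set_target(build_params(positional_last=False), 'chunk_0.fasta')
fixed = set_target(build_params(positional_last=True), 'chunk_0.fasta')

print(buggy)  # '' is still present; the value after '-c' was overwritten
print(fixed)  # ends with sequences, overlaps, target
```

With the positional arguments appended last, the overwrite lands on the placeholder and no empty file name reaches racon.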

This racon_wrapper script comes from racon v1.4.22, built as follows:

cd build
cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON .. -Dracon_build_wrapper=ON
make

Thank you.

rvaser commented 2 years ago

Hello, you are right. The problem is a bit further down in the script, where the last parameter is updated with the target file. I will fix it as soon as possible. Thank you for reporting it.

Best regards, Robert
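One way to avoid depending on which element happens to be last is to build the full argument list per target, with the positional arguments always appended after every option. This is only a sketch of that idea, not the fix that was released; `racon_command` and the target file names are hypothetical.

```python
# Minimal sketch of an order-independent construction (hypothetical helper,
# not the released fix).

def racon_command(racon, options, sequences, overlaps, target):
    """Return the argument list with the three positional arguments last."""
    return [racon, *options, sequences, overlaps, target]

options = ['-t', '48', '-b', '--cudaaligner-batches', '100', '-c', '100']
targets = ['chunk_0.fasta', 'chunk_1.fasta']  # illustrative file names

commands = [
    racon_command('racon', options, 'sample.fasta', 'overlaps.paf', t)
    for t in targets
]
for cmd in commands:
    print(cmd)
```

Because no placeholder is ever stored in the list, appending further options later cannot shift the target out of position.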

ghost commented 2 years ago

Thanks!

rvaser commented 2 years ago

The fix has finally been pushed in the new version (1.5.0). Sorry for the delay.