galaxy001 / pirs

profile based Illumina pair-end Reads Simulator
https://code.google.com/p/pirs/
GNU General Public License v2.0

Access to the noiseless reads #6

Open kedartatwawadi opened 7 years ago

kedartatwawadi commented 7 years ago

Hi! Thanks for the very useful simulator. While a FASTQ file is being simulated, is there any way to access the noiseless reads generated from the FASTA file (before indels and substitutions are applied) and write them to a file? This would be useful for FASTQ denoising experiments, since the denoised output could then be compared against the noiseless reads.

I see that the Read class has seq, raw_read and ref_read. I believe seq is the final read output by the simulator, but I am not sure which of raw_read and ref_read holds the noiseless read.

class Read {
public:
    vector<char>   seq;
    vector<char>   raw_read;
    vector<char>   ref_read;
    vector<char>   quality_vals;
    vector<Indel>  indels;
    vector<int>    error_pos;
    ReadPair      &pair;
    int            mask_end_len;

    Read(ReadPair &_pair)
        : pair(_pair), mask_end_len(0)
    { }

    inline int num_in_pair() const;
    inline char orientation() const;
};

It might also be useful to add a mode for this (ART has a similar facility built into the tool).

Thanks!

galaxy001 commented 7 years ago

Our C++ programmer has left for another company, and I only know Perl and C.

You can search for the mutation code and see which buffer is left unmutated. Or just print all three out and run a BLAST against the reference to confirm.

My personal guess is that it is ref_read.