drivenbyentropy / aptasuite

A full-featured bioinformatics software collection for the comprehensive analysis of aptamers in HT-SELEX experiments.
https://drivenbyentropy.github.io/
GNU General Public License v3.0

AptaPLEX cycles export #5

Closed PJpb closed 6 years ago

PJpb commented 6 years ago

Hi,

When using the options -parse -export cycles, my exported .fasta data looks like this:

>AptaSuite_1|test|length=43
71846584656767847167657167847165717167716767716571716565676767716771846567718471848471

Why is that, and how can I change it to sequences?

Best, PJ

drivenbyentropy commented 6 years ago

Hi and thank you for reporting this.

This is a bug, and it will be fixed in the next release. Internally, AptaSUITE stores sequences as byte arrays, and I forgot to convert them back to strings before writing them to file. What you see here is the ASCII representation of the sequences, i.e. each pair of digits is the decimal ASCII code of one nucleotide (65 = A, 67 = C, 71 = G, 84 = T).
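
As an illustration of the explanation above, the digits in an affected file can be mapped back to nucleotides by reading them two at a time as decimal ASCII codes. This is only a minimal sketch and not part of AptaSUITE; the helper name `decode_ascii_digits` is made up for this example, and it assumes every code has exactly two digits (true for uppercase A, C, G, T and N).

```python
def decode_ascii_digits(digits: str) -> str:
    """Turn a run of two-digit decimal ASCII codes back into a nucleotide string."""
    if len(digits) % 2 != 0:
        raise ValueError("expected an even number of digits")
    # Uppercase A (65), C (67), G (71), T (84) and N (78) all have two-digit
    # codes, so splitting into fixed-width pairs is unambiguous.
    return "".join(chr(int(digits[i:i + 2])) for i in range(0, len(digits), 2))


# Digit string copied from the exported record reported in this issue.
exported = (
    "71846584656767847167657167847165717167716767716571716565"
    "676767716771846567718471848471"
)
sequence = decode_ascii_digits(exported)
print(len(sequence), sequence)
```

For the record reported above, the decoded string has 43 characters, which matches the length=43 field in its header.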

Meanwhile, you could try exporting the sequences in fastq format (configuration parameter Export.SequenceFormat = fastq), which should not have this issue, and then convert the result to fasta.
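
The conversion step of this workaround can be sketched in a few lines. This is a hedged example, not an AptaSUITE feature: the function name `fastq_to_fasta` and the sample record are hypothetical, and it assumes plain four-line FASTQ records (header, sequence, "+" line, qualities) with no wrapped sequences.

```python
from typing import Iterable, Iterator


def fastq_to_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines from four-line FASTQ records."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if not line and not record:
            continue  # tolerate blank lines between records
        record.append(line)
        if len(record) == 4:
            header, seq, plus, _qualities = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record: {header!r}")
            yield ">" + header[1:]  # swap the '@' marker for '>'
            yield seq               # qualities are dropped
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")


# Hypothetical record, used only to show the call.
sample = ["@read1\n", "ACGT\n", "+\n", "IIII\n"]
fasta_lines = list(fastq_to_fasta(sample))
print("\n".join(fasta_lines))
```

To convert a real export, the same generator can be fed an open file handle and its output written line by line to a new file.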

My apologies for this, and thanks again!

drivenbyentropy commented 6 years ago

I have published a new release.

Could you please try with version v0.4.3 and let me know if it fixes the issue?

Thanks!

PJpb commented 6 years ago

Hi, the bug is fixed. Thank you for your quick reaction! Best, PJ

Edit: I think you've left debugging 'on'; the program produces quite a lot of [main] DEBUG lines at the beginning.

drivenbyentropy commented 6 years ago

Thanks for getting back to me. I will remove the debugging output (which comes from third-party libraries) in the next version.

PJpb commented 6 years ago

Hi, I rechecked the exported data, and unfortunately it is still buggy. With -export cycles in fasta format, the exported sequences have the right length, but they start with the primer region (which should be removed). In effect, the exported data is the primer region followed by a truncated random region. (The fastq export works fine.) Example:

Config file:

Experiment.primer5 = GTATACCTGCAGCTGAGG
Experiment.primer3 = GATGACACTACGTGACCA

(The random region of my aptamers is 44 nucleotides long, but this was not set in the config file.)

Analyzed sequence:

@MG00HS14:636:C8CGUACXX:5:1101:2735:1985 1:N:0:CTTGTA
TTGTAGACTCGGTATACCTGCAGCTGAGGTTGCCGCGCACCAGTCGTTCATAGATGTCGTTGGCGTGTTGCGCGATGACACTACGTGACCACGAGTCGCAGA

Exported:

>AptaSuite_1|test|length=44
GTATACCTGCAGCTGAGGTTGCCGCGCACCAGTCGTTCATAGAT
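
To make the discrepancy concrete, the following sketch extracts the region between the two primers from the read above and compares it with the exported sequence. It is only an illustration using exact string matching; it is not how AptaSUITE parses reads, and the function name `random_region` is an assumption of this example.

```python
from typing import Optional

PRIMER5 = "GTATACCTGCAGCTGAGG"
PRIMER3 = "GATGACACTACGTGACCA"

# Read and exported sequence copied from the report above.
READ = (
    "TTGTAGACTCGGTATACCTGCAGCTGAGGTTGCCGCGCACCAGTCGTTCATAGATGTCGTTGGCGTGTTGCGC"
    "GATGACACTACGTGACCACGAGTCGCAGA"
)
EXPORTED = "GTATACCTGCAGCTGAGGTTGCCGCGCACCAGTCGTTCATAGAT"


def random_region(read: str, primer5: str, primer3: str) -> Optional[str]:
    """Return the bases between the first exact 5' primer match and the next 3' primer match."""
    start = read.find(primer5)
    if start == -1:
        return None
    start += len(primer5)  # skip past the 5' primer itself
    end = read.find(primer3, start)
    if end == -1:
        return None
    return read[start:end]


expected = random_region(READ, PRIMER5, PRIMER3)
print("expected:", expected, len(expected))
print("exported:", EXPORTED, len(EXPORTED))
# The export has the right length but is shifted left by the primer length.
print("exported == primer5 + truncated region:",
      EXPORTED == PRIMER5 + expected[: len(EXPORTED) - len(PRIMER5)])
```

The extracted region is 44 nucleotides long, in line with the stated random region size, while the exported record consists of the 18-nucleotide 5' primer plus only the first 26 nucleotides of that region.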

drivenbyentropy commented 6 years ago

You are correct; this is indeed another bug. It was introduced when I failed to update this export class after changing how aptamers are represented internally.

Please let me know if v0.3.4b fixes the issue.

Thanks again for reporting this.

PJpb commented 6 years ago

It's fixed in v0.3.4b, many thanks!