bebop / poly

A Go package for engineering organisms.
https://pkg.go.dev/github.com/bebop/poly
MIT License

fastq de-aligned #325

Closed Koeng101 closed 1 year ago

Koeng101 commented 1 year ago

https://github.com/TimothyStiles/poly/blob/f587e72731bdfebe2e452604707a4efe8906672a/io/fastq/fastq.go#L120C38-L120C38

Still figuring it out, but noting it here: the fastq parser becomes misaligned while reading, and I'm not sure why yet. Once that happens, the rest of the parse gets messed up.

Koeng101 commented 1 year ago

Found it - https://github.com/TimothyStiles/poly/blob/f587e72731bdfebe2e452604707a4efe8906672a/io/fastq/fastq.go#L210

*(*string)(unsafe.Pointer(&sequence)), // Stdlib strings.Builder.String() does this so it should be safe.

This is NOT safe.
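A minimal sketch of why this cast can bite, assuming the slice being cast is later overwritten (for example, because it points into a reader's reused buffer, which is an assumption about the parser and not something confirmed in this thread). The names `aliasString` and `demoAlias` are hypothetical and exist only for this illustration. The cast copies nothing, so the resulting string shares memory with the slice and changes when the slice does:

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// aliasString reinterprets b as a string without copying any bytes.
// Hypothetical helper: it mirrors the cast quoted above.
func aliasString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// demoAlias returns the string's contents before and after the
// underlying buffer is overwritten.
func demoAlias() (before, after string) {
	buf := []byte("ATGC")
	s := aliasString(buf)

	// strings.Clone makes a real copy, so this snapshot is independent of buf.
	before = strings.Clone(s)

	// Simulate the buffer being reused for the next line of input.
	copy(buf, "NNNN")

	// s still points at buf's memory, so it now reflects the new bytes.
	after = s
	return before, after
}

func main() {
	before, after := demoAlias()
	fmt.Println("before overwrite:", before)
	fmt.Println("after overwrite: ", after)
}
```

strings.Builder can get away with the same trick because it never hands out its internal buffer and only appends to it; a slice that something else keeps writing into has no such guarantee.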

Koeng101 commented 1 year ago

Alright, I got this error again.

If you copy the sequence into a new slice instead of using it directly, it is just fine.

    sequence = line[:len(line)-1] // Exclude newline delimiter.
    // Copy each byte into a freshly allocated slice so the result no longer shares memory with line.
    var newSequence []byte
    for _, char := range sequence {
        newSequence = append(newSequence, char)
        if !strings.Contains("ATGCN", string(char)) {
            return Fastq{}, totalRead, errors.New("Only letters ATGCN are allowed for DNA/RNA in fastq file. Got letter: " + string(char))
        }
    }

I have no idea why, but this becomes a problem in production, and it takes a very long time to track down.
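One plausible explanation, offered as an assumption and not a confirmed diagnosis: if `line` points into a buffer that the reader reuses (bufio.Reader.ReadSlice, for instance, documents that the returned bytes stop being valid at the next read), then anything that keeps referring to that memory sees later lines bleed into earlier records. Copying breaks the link. A minimal sketch of the same fix using a plain `string(...)` conversion, which always allocates and copies; `validateAndCopy` is a hypothetical helper, not a function in poly:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateAndCopy checks a newline-terminated sequence line and returns an
// independent copy of it as a string. Hypothetical helper for illustration.
func validateAndCopy(line []byte) (string, error) {
	if len(line) == 0 {
		return "", errors.New("empty sequence line")
	}
	sequence := line[:len(line)-1] // Exclude newline delimiter.
	for _, char := range sequence {
		if !strings.ContainsRune("ATGCN", rune(char)) {
			return "", fmt.Errorf("only letters ATGCN are allowed for DNA/RNA in fastq file, got letter: %q", char)
		}
	}
	// string(sequence) copies the bytes, so the result does not alias line.
	return string(sequence), nil
}

func main() {
	buf := []byte("ATGC\n")
	seq, err := validateAndCopy(buf)
	if err != nil {
		panic(err)
	}

	// Simulate the reader reusing its buffer for the next line.
	copy(buf, "NNNN\n")

	// seq was copied, so it is unaffected by the overwrite.
	fmt.Println(seq)
}
```

The copy costs one allocation per record, which is the price of not depending on how long the reader's buffer stays untouched.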