exaloop / codon

A high-performance, zero-overhead, extensible Python compiler using LLVM
https://docs.exaloop.io/codon

How do we do gzip.readline() in Codon? #488

Closed panxiaoguang closed 11 months ago

panxiaoguang commented 11 months ago

In Python, we can call readline() on a file opened with gzip.open to read a gzip file line by line. How do we do this in Codon?

It seems there is no readline function for gzip files in Codon right now.

arshajii commented 11 months ago

Hi @panxiaoguang -- you can use gzip.open to do this:

import gzip

with gzip.open('foo.txt.gz') as f:
    for line in f:
        print(line.strip())

panxiaoguang commented 11 months ago

Yes, I just want to write a generator to parse a FASTQ file (NGS sequencing data) like this:

import gzip

class FastqRecord:
    def __init__(self, title, sequence, quality):
        self.title = title
        self.sequence = sequence
        self.quality = quality

def parse_fastq(fastq_file):
    with gzip.open(fastq_file, 'r') as f:
        while True:
            title = f.readline().strip()
            if not title:
                break  # end of file

            sequence = f.readline().strip()
            plus_line = f.readline().strip()
            quality = f.readline().strip()

            yield FastqRecord(title, sequence, quality)

So gzFile.readline() would be convenient here, but I can try using next() to do it instead.

Thanks
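
For readers following along, the next-based approach mentioned above might look roughly like the sketch below. It is written for standard Python and relies only on the file object being iterable, as in the gzip.open example earlier in the thread. The helper name parse_fastq_lines and the "rt" text mode are assumptions made for illustration, and whether Codon accepts every construct here (for example next with a default value, or yield from) has not been confirmed in this thread.

```python
import gzip


class FastqRecord:
    def __init__(self, title, sequence, quality):
        self.title = title
        self.sequence = sequence
        self.quality = quality


def parse_fastq_lines(lines):
    """Yield FastqRecord objects from any iterable of text lines.

    Hypothetical helper: takes the lines directly so the grouping
    logic does not depend on how the file was opened.
    """
    it = iter(lines)
    for title in it:
        title = title.strip()
        if not title:
            continue  # skip blank lines, e.g. a trailing newline
        # next(it, "") returns "" instead of raising at end of input,
        # so a truncated final record yields empty fields.
        sequence = next(it, "").strip()
        next(it, "")  # the '+' separator line is read and discarded
        quality = next(it, "").strip()
        yield FastqRecord(title, sequence, quality)


def parse_fastq(fastq_file):
    # "rt" opens the gzip file in text mode in standard Python
    # (assumption: adjust the mode if Codon's gzip.open differs).
    with gzip.open(fastq_file, "rt") as f:
        yield from parse_fastq_lines(f)
```

Calling parse_fastq('reads.fastq.gz') (a hypothetical file name) then yields one record per four-line block, the same shape as the readline-based generator above, without needing readline on the gzip file object.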