jamescasbon / PyVCF

A Variant Call Format reader for Python.
http://pyvcf.readthedocs.org/en/latest/index.html

Iterating through vcf.Reader deletes data #282

Open andyjslee opened 7 years ago

andyjslee commented 7 years ago
import vcf

file = "some_file.vcf"
vcf_reader = vcf.Reader(open(file, 'r'))
# First pass: prints every record
for record in vcf_reader:
    print(record)
# Second pass: prints nothing
for record in vcf_reader:
    print(record)

The second for loop doesn't print anything because vcf_reader is empty by then. I want vcf_reader to retain all of its records so that I can loop over them more than once. How do I do this?

chanedwin commented 7 years ago

Hi! This is because vcf_reader is an iterator that reads through the file once, not a list. The first for loop consumes the iterator, so the next time you use the same iterator it has no records left to yield.

There are two ways to solve this. You can store all the records in a list with list_vcf_reader = list(vcf_reader), which you can then iterate through as many times as you like. Or you can create a new iterator whenever you need another pass: vcf_reader_2 = vcf.Reader(open(file, 'r')). The second way is more space efficient, because it never holds every record in memory at once.
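To make the difference concrete, here is a minimal sketch that reproduces the behavior without needing PyVCF installed. The read_records generator and the VCF_TEXT contents are hypothetical stand-ins for vcf.Reader and a real file; they only mimic the one-pass behavior described above and are not PyVCF's actual parsing.

```python
import io

# Hypothetical, heavily simplified stand-in for the contents of a VCF file.
VCF_TEXT = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t100\t.\tA\tG\n"
    "1\t200\t.\tC\tT\n"
)

def read_records(handle):
    """Yield data lines one at a time, skipping headers (one pass, like vcf.Reader)."""
    for line in handle:
        if not line.startswith("#"):
            yield line.rstrip("\n").split("\t")

# The problem: a one-pass iterator is exhausted after the first loop.
reader = read_records(io.StringIO(VCF_TEXT))
first_pass = [rec for rec in reader]
second_pass = [rec for rec in reader]  # nothing left to yield

# Option 1: materialize into a list (every record stays in memory).
# With PyVCF this corresponds to list(vcf_reader).
records = list(read_records(io.StringIO(VCF_TEXT)))
pass_a = [rec[1] for rec in records]
pass_b = [rec[1] for rec in records]

# Option 2: build a fresh iterator for each pass (only one record in memory at a time).
# With PyVCF this corresponds to vcf.Reader(open(file, 'r')).
def count_records():
    return sum(1 for _ in read_records(io.StringIO(VCF_TEXT)))

print(len(first_pass), len(second_pass))
print(pass_a == pass_b, count_records(), count_records())
```

With a real file, Option 2 means reopening the file for each pass, so it trades extra reads from disk for lower memory use. For small files the list is usually simpler.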