Closed Maarten-vd-Sande closed 4 years ago
Should I worry about the CI/appveyor build failures?
@Maarten-vd-Sande Thanks for the contribution. You shouldn't worry about the Appveyor errors - that integration hasn't been maintained, and I'll likely stop testing on Windows in the future (unless someone wants to figure out the Windows-specific quirks that are preventing the tests from running). One question I have is about the use case for this change. I have no doubt that making the Fasta
class act more like a dict
is a good idea, but I do not expect that most users will want Fasta.values()
to return the IndexRecord
instances. If I'm misunderstanding please let me know.
It seems like a more logical return value would be a list of FastaRecord
instances, to match the behavior of Fasta.__contains__
, Fasta.__iter__
and Fasta.__getitem__
.
What I'm proposing then for Fasta.items()
would be something like:
>>> from pyfaidx import Fasta
>>> x = Fasta('/Users/matt/Downloads/gencode_vM12.fa')
>>> tuple(zip(x.keys(), x))[:10]
(('dfam_5S', FastaRecord("dfam_5S")), ('dfam_7SK', FastaRecord("dfam_7SK")), ('dfam_7SLRNA', FastaRecord("dfam_7SLRNA")), ('dfam_ACRO1', FastaRecord("dfam_ACRO1")), ('dfam_ALR', FastaRecord("dfam_ALR")), ('dfam_ALRa', FastaRecord("dfam_ALRa")), ('dfam_ALRb', FastaRecord("dfam_ALRb")), ('dfam_AluJb', FastaRecord("dfam_AluJb")), ('dfam_AluJo', FastaRecord("dfam_AluJo")), ('dfam_AluJr', FastaRecord("dfam_AluJr")))
Thanks for your reply. You are absolutely right! I will change it.
Would you object if I change this to methods in the Fasta
class instead of references to the odict instances? I think that's easier to implement, more pythonic, and it took me actually a minute to find where and how the keys() was implemented.
Something like:
class Fasta:
...
def keys(self):
return self.faidx.index.keys
def values(self):
return [fastarecord for fastarecord in self]
def items(self):
return tuple(zip(self.keys(), self.values()))
@Maarten-vd-Sande Sounds good. I was thinking along the same lines, and fully support this change.
@mdshw5 I haven't really looked into the code thoroughly, but wouldn't it make more sense to just refer to Fasta.records
, instead of this:
def keys(self)
return self.faidx.index.keys()
def values(self):
return [fastarecord for fastarecord in self]
def items(self):
return zip(self.keys(), self.values())
to this:
def keys(self):
return self.records.keys()
def values(self):
return self.records.values()
def items(self):
return self.records.items()
This should effectively do the same right? In my opinion the second option is much more clear.
This makes sense, but I will refactor the section where Fasta.records
is initialized so we can repurpose Fasta.keys
:
Okay, nice! :tada:
Thanks for your time
@mdshw5 dont merge yet! The tests are failing for older python versions.
Perhaps related to dictionary ordering?
:) Yes, I was just looking into this.
I don't have time for it today, but I don't mind taking a look at it somewhere this or next week.
I've tagged v0.5.7 for release.
It is a very minor thing, but now
pyfaidx.Fasta
behaves slightly more dictionary-like with.items
and.values
.