Open hyanwong opened 4 years ago
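I reckon I can do something like this:
    import json
    import attr

    def exclude_id(attribute, value):
        # Filter for attr.asdict: keep every attribute except "id"
        return attribute.name != "id"

    def population_equivalent(sd_pop, ts_pop):
        # Assumes sd (sample data file) and ts (tree sequence) are in scope
        d1 = {k: v for k, v in ts.population(ts_pop).__dict__.items() if k != "id"}
        # Encode the sample data metadata as JSON bytes before comparing
        d2 = {k: json.dumps(v).encode() if k == "metadata" else v
              for k, v in attr.asdict(sd.population(sd_pop), filter=exclude_id).items()}
        return d1 == d2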
The only issue is where there are attributes in the sample data file that are not in the tree sequence, or vice versa. In particular, I'm thinking of the individuals_time value in sample data files, which has no equivalent for an individual in a tree sequence until #322 is done (and after that it will require special treatment).
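One possible way to handle attributes that exist on only one side is to make the set of ignored keys explicit. This is only a rough sketch: the helper name `equivalent`, its `ignore` and `only_shared` parameters, and the example dicts are all made up for illustration and are not part of tsinfer or tskit.

```python
def equivalent(d1, d2, ignore=("id",), only_shared=False):
    """Compare two attribute dicts, skipping the keys listed in `ignore`.

    If only_shared is True, keys present in just one dict are skipped too.
    (Hypothetical helper, not an existing tsinfer or tskit function.)
    """
    keys1 = set(d1) - set(ignore)
    keys2 = set(d2) - set(ignore)
    if only_shared:
        keys1 = keys2 = keys1 & keys2
    elif keys1 != keys2:
        # One side has an attribute the other lacks
        return False
    return all(d1[k] == d2[k] for k in keys1)


# Made-up attribute dicts: the sample data side has an extra "time" key
sd_individual = {"id": 0, "location": [1.0, 2.0], "metadata": {"name": "A"}, "time": 0}
ts_individual = {"id": 5, "location": [1.0, 2.0], "metadata": {"name": "A"}}

print(equivalent(sd_individual, ts_individual))                         # False
print(equivalent(sd_individual, ts_individual, only_shared=True))       # True
print(equivalent(sd_individual, ts_individual, ignore=("id", "time")))  # True
```

Listing the ignored keys explicitly (the last call) is probably safer than `only_shared=True`, since a silently skipped attribute could hide a real difference.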
I would like to compare a Population as returned from a sample_data file with a Population in a tree sequence. In particular, I would like to test for equality, excluding the id field (perhaps we might call this "equivalence", as in population_equivalent(sd_pop, ts_pop)). A similar thing goes for an Individual (see also #324).
Should I wait for sgkit to do this, or is it worth implementing a quick hack now? And what's the best way to do it - can I e.g. simply use attr.asdict?
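For reference, here is a minimal sketch of how attr.asdict with a filter behaves. It assumes the object being compared is an attrs class; the `Population` class below is a made-up stand-in, not the real tsinfer or tskit class.

```python
import attr


@attr.s
class Population:
    # Hypothetical stand-in for illustration, not the tsinfer/tskit class
    id = attr.ib()
    metadata = attr.ib()


def exclude_id(attribute, value):
    # attr.asdict calls the filter with (Attribute, value); True means "keep"
    return attribute.name != "id"


a = Population(id=0, metadata={"name": "CEU"})
b = Population(id=3, metadata={"name": "CEU"})

dict_a = attr.asdict(a, filter=exclude_id)
dict_b = attr.asdict(b, filter=exclude_id)

print(dict_a)            # {'metadata': {'name': 'CEU'}}
print(dict_a == dict_b)  # True: equal once id is excluded
print(a == b)            # False: the generated __eq__ also compares id
```

So attr.asdict does look like enough for the attrs side; the remaining work is getting the other object into a comparable dict.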