Closed by cclarkson 3 years ago
Thanks Chris.
We start with 809 (total derived samples). Then following sequencing and sequence QC we end up with 699.
In the crosses file that @jonbrenas curated, `crosses.tsv`, we have 568 samples, of which 519 are present in the 699. The remainder are samples where we do not know the pedigree.
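The overlap quoted above can be re-checked with plain set arithmetic on the sample IDs. This is only a minimal sketch: the column name `sample_id`, the helper `read_sample_ids`, and the toy file contents are assumptions for illustration, not the real layout of `crosses.tsv` or the sequenced-sample metadata.

```python
import csv
import io


def read_sample_ids(handle, column="sample_id", delimiter="\t"):
    """Return the set of values found in `column` of a delimited text file.

    The column name is an assumption; adjust it to the real header.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return {row[column] for row in reader}


# Toy stand-ins for the two files (hypothetical contents).
crosses_tsv = io.StringIO("sample_id\tcross_id\nS1\tA\nS2\tA\nS3\tB\nS9\tB\n")
sequenced_csv = io.StringIO("sample_id\nS1\nS2\nS3\nS4\nS5\n")

pedigree_ids = read_sample_ids(crosses_tsv)
sequenced_ids = read_sample_ids(sequenced_csv, delimiter=",")

# Samples with a known pedigree that also passed sequencing QC.
in_both = pedigree_ids & sequenced_ids
# Samples listed in the pedigree file but absent from the sequenced set.
pedigree_only = pedigree_ids - sequenced_ids
# Sequenced samples for which no pedigree is recorded.
no_pedigree = sequenced_ids - pedigree_ids

print(len(in_both), len(pedigree_only), len(no_pedigree))
```

If the numbers in this thread hold, the three counts on the real files would be 519, 568 - 519 = 49, and 699 - 519 = 180.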
Then in `cross.samples.meta.txt` we have 11 crosses. I think that this file is the one I should use, but it needs the other 4 crosses adding.
So I guess, 2 options to fix:
a) NH to subset the `crosses.tsv` file to those 15.
b) @jonbrenas to add 4 additional crosses to `cross.samples.meta.txt` and @hardingnj to point at this file instead.
The crosses genotypes have data for 699 samples, as does the (old) crosses metadata at `vo_agam_release/v3/metadata/general/AG1000G-X/samples.meta.csv`. The new crosses metadata, however, has only 519 rows.
Also, in the text we talk about 15 crosses, five of which are new. If I run `df.cross_id.unique()` on the new metadata I get 24 named crosses. @hardingnj, any ideas what has happened here?
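One way to narrow this down is to list which `cross_id` values fall outside the expected set and how many samples each one has. This is a rough sketch with pandas: the toy data frame, the `sample_id` column, and the `expected_crosses` set are all hypothetical placeholders, since the real list of 15 cross names is not given here.

```python
import pandas as pd

# Toy stand-in for the new crosses metadata (hypothetical values).
df = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4", "S5", "S6"],
        "cross_id": ["c1", "c1", "c2", "c3", "c3", "c4"],
    }
)

# Hypothetical set of cross names we expect to see; replace with the real 15.
expected_crosses = {"c1", "c2", "c5"}

# Distinct cross names present in the metadata, ignoring missing values.
observed = set(df["cross_id"].dropna().unique())

# Names in the metadata that are not in the expected set, and vice versa.
unexpected = sorted(observed - expected_crosses)
missing = sorted(expected_crosses - observed)

# Number of rows (samples) per cross, largest first.
sizes = df["cross_id"].value_counts()

print("observed crosses:", len(observed))
print("unexpected:", unexpected)
print("missing:", missing)
print(sizes)
```

Very small groups in `sizes` would be the first place to look when working out where the extra cross names come from.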