cldf-datasets / hueblerstability

Creative Commons Attribution 4.0 International

Uncoded values #5

Closed xrotwang closed 3 years ago

xrotwang commented 3 years ago

Two features seem to be uncoded in 33 sheets:

TE028 Counter({'': 33, '1': 17, '?': 7, '0': 4})
TE029 Counter({'': 33, '1': 17, '?': 6, '0': 5})

Is this correct?
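A tally like the one above could be produced roughly as follows. This is a minimal sketch, not the script actually used: it assumes each sheet is a tab-separated file with `Feature_ID` and `Value` columns, and the helper name `value_counts` and the two tiny sheets are made up for illustration.

```python
import csv
import io
from collections import Counter


def value_counts(sheets, feature_ids):
    """Tally the coded values of each requested feature across all sheets."""
    counts = {fid: Counter() for fid in feature_ids}
    for sheet in sheets:
        # Assumption: tab-separated sheets with Feature_ID and Value columns.
        for row in csv.DictReader(sheet, delimiter="\t"):
            fid = row["Feature_ID"]
            if fid in counts:
                # An empty or missing cell is counted as '' (uncoded).
                counts[fid][(row["Value"] or "").strip()] += 1
    return counts


# Two made-up sheets, held in memory instead of on disk.
sheet_a = "Feature_ID\tValue\nTE028\t1\nTE029\t\n"
sheet_b = "Feature_ID\tValue\nTE028\t\nTE029\t?\n"

counts = value_counts(
    [io.StringIO(sheet_a), io.StringIO(sheet_b)],
    ["TE028", "TE029"],
)
for fid, counter in counts.items():
    print(fid, counter)
```

With real data, open file objects for the raw sheets would be passed in place of the `io.StringIO` objects.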

NataliiaHue commented 3 years ago

Thanks for these checks! Is there a way to automatically remove several features altogether from all the sheets? These are features I don't need now and will never need (some are logically non-independent of other features, and some are erroneous in design).

xrotwang commented 3 years ago

Let me know which ones and I'll purge them from the raw sheets.

NataliiaHue commented 3 years ago

Thank you very much!!!

"TE001", "TE002", "TE009", "TE012", "TE014", "TE015", "TE016", "TE022", "TE025", "TE026", "TE028", "TE029", "TE033", "TE034", "TE036", "TE040", "TE041", "TE042", "TE043", "TE044", "TE045", "TE046", "TE047", "TE048", "TE049", "TE051", "TE055", "TE056", "TE057", "TE058", "TE060", "TE061", "TE062", "TE063", "TE064", "TE065", "TE067", "TE068", "TE069", "TE070", "TE071", "TE072", "TE073", "TE074", "TE076", "TE077", "TS081", "TS082", "TS083", "TS083", "TS084", "TS085"
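Purging a list like this from a sheet could look roughly like the sketch below. It is not the script actually run on the raw sheets: it assumes tab-separated sheets with a `Feature_ID` column, `purge_features` is a hypothetical helper, and `OBSOLETE` holds only a few of the IDs from the list above.

```python
import csv
import io

# A few IDs from the list above; the full list would go here.
# A set makes repeated IDs harmless.
OBSOLETE = {"TE001", "TE002", "TE028", "TE029"}


def purge_features(tsv_text, obsolete):
    """Return the sheet text with every row for an obsolete feature dropped."""
    # Assumption: tab-separated sheet with a Feature_ID column.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        if row["Feature_ID"] not in obsolete:
            writer.writerow(row)
    return out.getvalue()


# Made-up sheet for illustration.
sheet = "Feature_ID\tValue\nTE001\t1\nTE003\t0\nTE028\t\n"
cleaned = purge_features(sheet, OBSOLETE)
print(cleaned)
```

Only the row for `TE003` survives here, since `TE001` and `TE028` are in the obsolete set.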

There are also some Grambank features that I don't use in my analysis because they are non-binary, but I guess it would be better to leave them in. What do you think?

NataliiaHue commented 3 years ago

Do you think it would be easy to import my data into Grambank? Most of it is there, but not all. Or will I have to submit it separately, e.g. to Harald?

xrotwang commented 3 years ago

I think merging your sheets into Grambank wouldn't be much of a problem. Maybe coordinate a date with Hedvig when this could/should happen.

NataliiaHue commented 3 years ago

ok, thank you!

xrotwang commented 3 years ago

OK, removing the obsolete features resolves this issue and #4, and reduces the missing refs to:

('Janhunen', '2003') 162
('Pakendorf', 'in preparation') 53
('Janhunen', '2010') 7
('Yakup', 'in press') 2
('Pakendorf', '430-445') 2
('Oskolskaya', 'p.c.') 1
('Robbeets', '2017') 1
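A tally of this shape can be sketched as below. This assumes, purely for illustration, that citations have already been parsed into `(author, year)` pairs and that the bibliography is available as a set of such pairs. `missing_refs` is a hypothetical helper and the sample data is made up.

```python
from collections import Counter


def missing_refs(cited, known):
    """Count cited (author, year) pairs that have no bibliography entry."""
    return Counter(ref for ref in cited if ref not in known)


# Made-up citations and bibliography keys, for illustration only.
cited = [
    ("Janhunen", "2003"),
    ("Janhunen", "2003"),
    ("Robbeets", "2017"),
    ("Smith", "1999"),
]
known = {("Smith", "1999")}

missing = missing_refs(cited, known)
# most_common() lists the most frequently missing references first.
for ref, n in missing.most_common():
    print(ref, n)
```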