D-PLACE / dplace-dataset-sccs

D-PLACE dataset derived from Murdock and White 1969 'Standard Cross-Cultural Sample'
https://escholarship.org/uc/item/62c5c02n

How to most efficiently correct variable titles and definitions? #5

Open kirbykat opened 7 years ago

kirbykat commented 7 years ago

Many variables in the SCCS have had their titles and code definitions shortened in the digitization process (i.e., the original digital files, available from World Cultures, provide only these shortened titles/definitions).

A lot of information is lost in the process!

E.g., v116, which in the digital SCCS (and now in D-PLACE) is called "Sex Frequency in Marriage", was originally published as "Attitude towards desirability of frequent sex in marriage". Here's a screenshot of the original codes, compared to the codes in D-PLACE.

[Screenshot: code_example]

Is the best solution to use a text extractor to get original definitions? (This is what I did for the EA, but will take a while for 1781 variables...) For now, it is up to the user...

xrotwang commented 7 years ago

I guess having this typed in by a student assistant may be quicker than fiddling with OCR and having to cross-check the results.

kirbykat commented 6 years ago

I'm going to work on improving the variable titles/definitions. I'm planning to do it in the csv file because it will be a fairly major overhaul. Any concerns with this, or anyone else working on the variables?

xrotwang commented 6 years ago

If by csv file you mean this https://github.com/D-PLACE/dplace-data/blob/master/datasets/SCCS/variables.csv then no concerns. Not even if someone else would be working on it, because, hey, that's what version control is good for :)
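
For anyone doing this kind of overhaul, the corrected titles could be typed into a separate file and merged into variables.csv, so that the version-control diff shows exactly what changed. The following is only a minimal sketch: the column names `id`, `title` and `definition`, the corrections file, and the placeholder row `v999` are assumptions made for illustration, not something confirmed in this thread. Only the v116 titles come from the example above.

```python
import csv
import io

# Assumed layout of variables.csv (column names are an assumption).
VARIABLES = (
    "id,title,definition\n"
    "v116,Sex Frequency in Marriage,\n"
    "v999,Placeholder title,\n"  # hypothetical row, for illustration only
)

# Hypothetical corrections file holding titles typed in from the original publication.
CORRECTIONS = (
    "id,title\n"
    "v116,Attitude towards desirability of frequent sex in marriage\n"
)


def apply_corrections(variables_csv, corrections_csv):
    """Return (updated CSV text, list of ids whose title or definition changed)."""
    corrections = {
        row["id"]: row for row in csv.DictReader(io.StringIO(corrections_csv))
    }
    reader = csv.DictReader(io.StringIO(variables_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    changed = []
    for row in reader:
        fix = corrections.get(row["id"])
        if fix:
            for field in ("title", "definition"):
                new = fix.get(field)
                # Only overwrite when a non-empty, different value is supplied.
                if new and new != row[field]:
                    row[field] = new
                    if row["id"] not in changed:
                        changed.append(row["id"])
        writer.writerow(row)
    return out.getvalue(), changed


updated, changed = apply_corrections(VARIABLES, CORRECTIONS)
print(changed)  # ids that were modified
print(updated)
```

Rows without an entry in the corrections file are written back unchanged, so the resulting diff stays limited to the variables that were actually retyped.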

xrotwang commented 5 years ago

@kirbykat has this issue been addressed and solved by now?

xrotwang commented 9 months ago

In the new scheme of things, the place to edit this now is