D-PLACE / dplace-data

The data repository for the D-PLACE Project (Database of Places, Language, Culture and Environment)
https://d-place.org
Creative Commons Attribution 4.0 International

Merge references #75

Closed xrotwang closed 7 years ago

xrotwang commented 7 years ago

References from all datasets should be merged, and the references field in data.csv should be enhanced to also include pages, e.g.

"Memmott, 1983a:123-234; Memmott, 1983b"

i.e. using a syntax of semicolon-separated reference keys of the form <key>:<pages>, where the :<pages> part is optional.

Parsing of this syntax should be done when initializing a Data object.
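
A minimal sketch of how such a field could be parsed, assuming that keys never contain a colon or semicolon. The names `Reference` and `parse_references` are hypothetical and are not taken from the actual dplace-data code:

```python
from typing import List, NamedTuple, Optional


class Reference(NamedTuple):
    """Hypothetical container for one parsed reference."""
    key: str
    pages: Optional[str]  # None when no pages were given


def parse_references(field: str) -> List[Reference]:
    """Parse a string like "Memmott, 1983a:123-234; Memmott, 1983b".

    Assumption: keys contain neither ':' nor ';'.
    """
    refs = []
    for chunk in field.split(';'):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate empty fields and trailing semicolons
        key, sep, pages = chunk.partition(':')
        pages = pages.strip()
        refs.append(Reference(key.strip(), pages if sep and pages else None))
    return refs
```

With the example above, `parse_references("Memmott, 1983a:123-234; Memmott, 1983b")` yields one reference with pages `123-234` and one with pages `None`.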

xrotwang commented 7 years ago

@kirbykat @SimonGreenhill I was looking at the EA references and was wondering whether we should merge some of these, extracting the page info into data.csv (as explained above). E.g. for the following batch of references, page numbers are the only difference between the bibliographical records:

"Murdock, 1934b",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 107-154. New York.",,
"Murdock, 1934c",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 135-162. New York.",,
"Murdock, 1934d",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 221-263. New York.",,
"Murdock, 1934e",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 359-402. New York.",,
"Murdock, 1934f",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 475-507. New York.",,
"Murdock, 1934g",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 551-595. New York.",,
"Murdock, 1934h",True,"Murdock, G. P. 1934. Our Primitive Contemporaries, pp. 85-106. New York.",,
"Murdock, 1934i",True,"Murdock, G. P. 1934. Our Primitive Contemporaries. pp. 403-450. New York.",,

Merging these into a single record Murdock, 1934, referenced in data.csv as Murdock, 1934:107-154 etc., would make sense, no?

If OTOH the referenced page ranges correspond to a book chapter with a separate title, we may alternatively want to include this info in the reference's BibTeX record.
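
A rough sketch of what such a merge could look like, assuming records arrive as (key, citation) pairs and that pages always appear as "pp. N-M" after a comma or full stop. The names `split_pages` and `merge_by_pages` are made up for illustration, and deriving the merged key by dropping the trailing letter is only one possible convention; it does not check for collisions with existing keys:

```python
import re
import string
from collections import defaultdict

# Assumed page notation: ", pp. 107-154" or ". pp. 403-450"
PAGES = re.compile(r"[,.]\s*pp?\.\s*(\d+(?:-\d+)?)")


def split_pages(citation):
    """Return (citation without the page part, pages or None)."""
    m = PAGES.search(citation)
    if not m:
        return citation, None
    return citation[:m.start()] + citation[m.end():], m.group(1)


def merge_by_pages(records):
    """Merge records whose citations differ only in their pages.

    Returns (merged, key_map): merged maps new keys to citations,
    key_map maps each old key to the string to use in data.csv.
    """
    groups = defaultdict(list)
    for key, citation in records:
        base, pages = split_pages(citation)
        groups[base].append((key, citation, pages))

    merged, key_map = {}, {}
    for base, members in groups.items():
        if len(members) == 1:
            # Nothing to merge with: keep the record untouched.
            key, citation, _ = members[0]
            merged[key] = citation
            key_map[key] = key
            continue
        # Assumed convention: "Murdock, 1934b" becomes "Murdock, 1934".
        new_key = members[0][0].rstrip(string.ascii_lowercase)
        merged[new_key] = base
        for key, _, pages in members:
            key_map[key] = f"{new_key}:{pages}" if pages else new_key
    return merged, key_map
```

Because the pattern accepts either a comma or a full stop before "pp.", the Murdock, 1934i record would fall into the same group as the others despite its slightly different punctuation.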

kirbykat commented 7 years ago

@xrotwang Yes, I think what you suggest makes sense. I also agree that when page numbers delineate a specific book chapter, it is best to incorporate that into the BibTeX.

kirbykat commented 7 years ago

@xrotwang @SimonGreenhill I've just finished extracting and doing a preliminary clean on the ~300 WNAI references. I'm attaching them here in case they can still be incorporated into the current reference cleaning/merging effort. These will be linked to the WNAI data, which I'm also still wrangling.

By the way, this is the last reference file - the SCCS references are already part of the EA reference set.

The first column is a unique id/look-up column, from which first author last name and year should be easy to extract.

WNAI_sources_3April2017_utf8.zip

-k

SimonGreenhill commented 7 years ago

Closed as I think this is fixed, and we can treat the WNAI as a whole new set of problems.