ror-community / ror-roadmap

Central information about what is happening at ROR and how to contribute feedback

[FEATURE] Publish ROR data dump as CSV (in addition to JSON) #140

Closed lizkrznarich closed 1 year ago

lizkrznarich commented 1 year ago

Describe the problem you would like to solve

GRID previously made its data dump available in CSV, as well as JSON and RDF. For those who previously used the GRID CSV, those looking to import ROR data into a SQL database, and those who would like to interact with ROR data without having to parse JSON, it would be useful to also publish ROR data dumps in CSV.

Note: This issue covers the CSV portion of the work described in https://github.com/ror-community/ror-api/issues/113. The RDF portion of issue 113 will be addressed separately.

Describe the solution you'd like

Initially, we will produce a single CSV with a subset of fields (those that are not deprecated or empty in all records). The structure of this CSV is different from GRID CSV files; ROR does not use a database for curation, so we will not include a full_tables directory.

- id
- name
- status
- types
- established
- country.country_name
- country.country_code
- addresses.geonames_city.id
- addresses.geonames_city.name
- addresses.geonames_city.geonames_admin1.code
- addresses.geonames_city.geonames_admin1.name
- aliases (delimited list in one field)
- acronyms (delimited list in one field)
- labels (delimited list in one field)
- links
- wikipedia_url
- external_ids (GRID, ISNI, Funder Registry and Wikidata only, as columns with [id_name].preferred and [id_name].all as headings)
- relationships (delimited list in one field)
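For illustration only, here is a minimal Python sketch of how one JSON record could be flattened into a row with the columns listed above. It is not the actual ROR implementation. The record shape (for example `addresses[0].geonames_city`), the `"; "` delimiter, the `FundRef` key for Funder Registry IDs, the `type: id` rendering of relationships, and the helper names `flatten_record` and `records_to_csv` are all assumptions made for this example.

```python
import csv
import io

DELIMITER = "; "  # assumed list delimiter; the issue does not specify one
# Assumed keys under external_ids; "FundRef" stands in for Funder Registry.
EXTERNAL_ID_TYPES = ["GRID", "ISNI", "FundRef", "Wikidata"]


def join_list(values):
    """Join a list of values into one delimited field."""
    return DELIMITER.join(str(v) for v in (values or []))


def flatten_record(record):
    """Flatten one JSON record (assumed shape) into a flat dict of CSV columns."""
    country = record.get("country") or {}
    addresses = record.get("addresses") or []
    first_address = addresses[0] if addresses else {}  # only the first address is used
    city = first_address.get("geonames_city") or {}
    admin1 = city.get("geonames_admin1") or {}
    labels = [item.get("label", "") for item in (record.get("labels") or [])]

    row = {
        "id": record.get("id") or "",
        "name": record.get("name") or "",
        "status": record.get("status") or "",
        "types": join_list(record.get("types")),
        "established": record.get("established") or "",
        "country.country_name": country.get("country_name") or "",
        "country.country_code": country.get("country_code") or "",
        "addresses.geonames_city.id": city.get("id") or "",
        "addresses.geonames_city.name": city.get("name") or "",
        "addresses.geonames_city.geonames_admin1.code": admin1.get("code") or "",
        "addresses.geonames_city.geonames_admin1.name": admin1.get("name") or "",
        "aliases": join_list(record.get("aliases")),
        "acronyms": join_list(record.get("acronyms")),
        "labels": join_list(labels),
        "links": join_list(record.get("links")),
        "wikipedia_url": record.get("wikipedia_url") or "",
    }

    external_ids = record.get("external_ids") or {}
    for id_type in EXTERNAL_ID_TYPES:
        entry = external_ids.get(id_type) or {}
        all_ids = entry.get("all") or []
        if isinstance(all_ids, str):  # tolerate a single string instead of a list
            all_ids = [all_ids]
        row[f"{id_type}.preferred"] = entry.get("preferred") or ""
        row[f"{id_type}.all"] = join_list(all_ids)

    # Assumed rendering of each relationship as "type: id".
    row["relationships"] = join_list(
        [f"{rel.get('type', '')}: {rel.get('id', '')}"
         for rel in (record.get("relationships") or [])]
    )
    return row


def records_to_csv(records):
    """Serialize records to CSV text with a header row."""
    fieldnames = list(flatten_record({}).keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        writer.writerow(flatten_record(record))
    return buffer.getvalue()
```

Given a list of parsed records, `records_to_csv(records)` returns the CSV as a string. A reader of the file would split the list-valued fields on the same delimiter, so the delimiter should be one that cannot appear inside the values themselves.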

Who would benefit from this feature?

Developers, librarians, researchers, and others who wish to use ROR data without parsing JSON.


lizkrznarich commented 1 year ago

https://github.com/ror-community/curation_ops/pull/21
https://github.com/ror-community/ror-records/pull/72