ror-community / ror-roadmap

Central information about what is happening at ROR and how to contribute feedback

[FEATURE] Provide an alias-to-ROR data dump and/or directions for creating one #230

Open amandafrench opened 8 months ago

amandafrench commented 8 months ago

Describe the problem you would like to solve

A user asks: "At the moment there's one ROR to many aliases in a single field but it would be helpful to have a data dump which lists each alias with the relevant ROR. We would find this useful (less to convert in code)."

Describe the solution you'd like

A data dump with every label, acronym, and alias in a single field mapped to its corresponding ROR

Who would benefit from this feature?

Developers integrating ROR

Additional information

JohnCambridge commented 8 months ago

We use ROR as part of a manuscript affiliation matching process, and one of the things we are discussing internally is how we enrich the ROR API/data dump data to make the machine match more accurate. It would be wonderful to have a data dump option that listed the ROR against each alias, as per the crude example attached (which is only to illustrate the RORs repeating per alias listed and is not meant to imply a layout).

amandafrench commented 8 months ago

Interesting, @JohnCambridge, thanks so much -- helps to know that you want it for name matching. You've probably seen it, but just in case you haven't, we do have a guide on various methods of matching names to ROR IDs using various tools, including some third-party tools where you can see the code: https://ror.readme.io/docs/matching

adambuttrick commented 4 months ago

@JohnCambridge While we would need additional users requesting this before adding it to our data dump or creating a separate file, I've written a script to perform this CSV file conversion:

https://github.com/ror-community/curation_ops/tree/main/utilities/convert_data_dump_all_names_csv

This has no external dependencies and can take as input either the data dump zip file or the schema v2.0 JSON file. Hope this helps!
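
For readers who want to see the general shape of such a conversion, here is a minimal sketch. It is not the linked script. It assumes schema v2.0 records, where each record carries an `id` and a `names` list whose entries have `value`, `types`, and `lang` fields. The function names, the CSV column names, and the sample record (including its ROR ID) are hypothetical and exist only for illustration.

```python
import csv
import io


def flatten_names(records):
    """Yield one (ror_id, name, name_types, lang) tuple per name entry.

    Assumes each record looks like a schema v2.0 record: an "id" string
    plus a "names" list of {"value", "types", "lang"} objects.
    """
    for record in records:
        ror_id = record["id"]
        for name in record.get("names", []):
            yield (
                ror_id,
                name["value"],
                ";".join(name.get("types", [])),  # e.g. "label;ror_display"
                name.get("lang") or "",           # lang may be null
            )


def write_names_csv(records, out):
    """Write the flattened rows to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["ror_id", "name", "name_types", "lang"])  # hypothetical headers
    writer.writerows(flatten_names(records))


# Hypothetical sample record; the ID and names are made up for illustration.
sample = [
    {
        "id": "https://ror.org/00example0",
        "names": [
            {"value": "Example University", "types": ["label", "ror_display"], "lang": "en"},
            {"value": "EU", "types": ["acronym"], "lang": None},
            {"value": "University of Example", "types": ["alias"], "lang": "en"},
        ],
    }
]

buffer = io.StringIO()
write_names_csv(sample, buffer)
print(buffer.getvalue())
```

To run this against a real file, one would load the data dump JSON with `json.load` (assuming the file is a top-level list of records) and pass the result to `write_names_csv` along with a file opened with `newline=""`. The ROR ID repeats on every row, which is the layout requested above.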

JohnCambridge commented 4 months ago

Hi Adam,

Thank you so much. Let me take a look at this and get one of my Data Analysts to see how this can be incorporated/adapted into our data procedures for ROR extract. Thanks for taking the time! John