research-software-directory / RSD-as-a-service

This repo contains the new RSD-as-a-service implementation
https://research.software

Scrape organization location #806

Closed ctwhome closed 8 months ago

ctwhome commented 1 year ago

In order to place the organizations on a map (see image), we need a scraper that pulls the location of an organization based on its ROR.

We are aware that ROR only has the coordinates of the city where the organization is based, but that could be sufficient for now.

Documentation: https://maplibre.org/maplibre-gl-js-docs/example/geojson-markers/

To make things easier for the front end, the column "location" in the table "organisation" could hold JSON in the following format:

location: {
  country: "The Netherlands",
  place: "Amsterdam",
  lat: 32.234234234,
  long: 4.1223413234
}
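
To illustrate how such a value could feed the map, here is a minimal sketch that turns one "location" object into a GeoJSON point feature, the kind of input the MapLibre marker example linked above works with. The function name `location_to_feature` and the `name` property are hypothetical, and the keys `country`, `place`, `lat` and `long` are taken from the proposal above rather than from an existing schema.

```python
from typing import Any


def location_to_feature(name: str, location: dict[str, Any]) -> dict[str, Any]:
    """Build a GeoJSON Point feature from a proposed `location` value.

    `name` is a hypothetical label for the marker; `location` is assumed
    to have the keys proposed in this issue.
    """
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON orders coordinates as [longitude, latitude].
            "coordinates": [location["long"], location["lat"]],
        },
        "properties": {
            "name": name,
            "country": location.get("country"),
            "place": location.get("place"),
        },
    }


# Example values copied from the proposal above.
example_location = {
    "country": "The Netherlands",
    "place": "Amsterdam",
    "lat": 32.234234234,
    "long": 4.1223413234,
}
feature = location_to_feature("Example organisation", example_location)
```

Note that GeoJSON expects longitude first, so the order is swapped relative to the usual "lat, long" way of writing coordinates.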

image

Example ROR json response

Request

https://api.ror.org/organizations?query=https://ror.org/01ppy6j62

Response

{
    "number_of_results": 1,
    "time_taken": 1,
    "items": [{
        "id": "https://ror.org/01ppy6j62",
        "name": "Geophysics GPR International",
        "email_address": null,
        "ip_addresses": [],
        "established": 1974,
        "types": ["Other"],
        "relationships": [],
        "addresses": [{
            "lat": 45.53121,
            "lng": -73.51806,
            "state": null,
            "state_code": null,
            "city": "Longueuil",
            "geonames_city": {
                "id": 6059891,
                "city": "Longueuil",
                "geonames_admin1": {
                    "name": "Quebec",
                    "id": 6115047,
                    "ascii_name": "Quebec",
                    "code": "CA.10"
                },
                "geonames_admin2": {
                    "name": "Montérégie",
                    "id": 6076966,
                    "ascii_name": "Montérégie",
                    "code": "CA.10.16"
                },
                "license": {
                    "attribution": "Data from geonames.org under a CC-BY 3.0 license",
                    "license": "http://creativecommons.org/licenses/by/3.0/"
                },
                "nuts_level1": {
                    "name": null,
                    "code": null
                },
                "nuts_level2": {
                    "name": null,
                    "code": null
                },
                "nuts_level3": {
                    "name": null,
                    "code": null
                }
            },
            "postcode": null,
            "primary": false,
            "line": null,
            "country_geonames_id": 6251999
        }],
        "links": ["https://geophysicsgpr.com/"],
        "aliases": [],
        "acronyms": [],
        "status": "active",
        "wikipedia_url": "",
        "labels": [{
            "label": "Geophysique GPR International",
            "iso639": "fr"
        }],
        "country": {
            "country_name": "Canada",
            "country_code": "CA"
        },
        "external_ids": {
            "ISNI": {
                "preferred": null,
                "all": ["0000 0004 0471 1571"]
            },
            "GRID": {
                "preferred": "grid.450825.e",
                "all": "grid.450825.e"
            }
        }
    }],
    "meta": {
        "types": [{
            "id": "other",
            "title": "Other",
            "count": 1
        }],
        "countries": [{
            "id": "ca",
            "title": "Canada",
            "count": 1
        }],
        "statuses": [{
            "id": "active",
            "title": "active",
            "count": 1
        }]
    }
}
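
As a rough sketch of what the scraper could do with a response like the one above, the following function maps it onto the proposed "location" format. It is not the actual RSD scraper: the function name `extract_location` is hypothetical, it assumes the response has the shape shown above, and it simply takes the first entry in `items` and in `addresses`.

```python
from typing import Any, Optional


def extract_location(ror_response: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Map a parsed ROR response onto the proposed `location` format.

    Returns None when no organisation, address or coordinates are present.
    """
    items = ror_response.get("items") or []
    if not items:
        return None
    org = items[0]

    addresses = org.get("addresses") or []
    if not addresses:
        return None
    address = addresses[0]

    # ROR may return null for fields it does not know.
    if address.get("lat") is None or address.get("lng") is None:
        return None

    country = org.get("country") or {}
    return {
        "country": country.get("country_name"),
        "place": address.get("city"),
        "lat": address["lat"],
        # ROR names the field `lng`; the proposal above uses `long`.
        "long": address["lng"],
    }


# Trimmed copy of the example response above.
sample_response = {
    "number_of_results": 1,
    "items": [{
        "id": "https://ror.org/01ppy6j62",
        "name": "Geophysics GPR International",
        "addresses": [{"lat": 45.53121, "lng": -73.51806, "city": "Longueuil"}],
        "country": {"country_name": "Canada", "country_code": "CA"},
    }],
}
location = extract_location(sample_response)
```

Returning `None` instead of raising keeps organisations without a usable address out of the map without failing the whole scrape, which is one possible design choice among others.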

ctwhome commented 1 year ago

The map could easily look like this:

cmeessen commented 8 months ago

The PR that closed this issue added the country property to the database and a country field to the organisation cards. However, the country is not currently being scraped. I suggest reopening this issue.