QGreenland-Net / .github

The front-page README and catch-all for issues that don't have another home.

Familiarize with DataONE API #1

Open mfisher87 opened 3 months ago

mfisher87 commented 3 months ago

Prior work: https://github.com/nsidc/dataone_exploration


@rushirajnenuji provided this information:

We use a Solr index to store and search the metadata related to datasets. The UI display is mostly driven by this Solr index:

Solr Index

If you'd like to see the queryable fields, here are the fields supported by the schema: ADC MN Solr and entire DataONE corpus Solr.

Sample query for the 2nd use case:

What is the breakdown of all data file types within our area of interest? E.g., how many csv, geotiff, etc. I'd like a list of all unique file types. I see there are some basic metrics on our data portal page, but it bins some things into "other".

Query params (key:value), with the values URL-encoded:

q:{!join%20from=resourceMap%20to=resourceMap}((siteText:Greenland)%20OR%20isPartOf:%22urn%5C%3Auuid%5C%3Adfbfc37d%5C-e907%5C-45e8%5C-913f%5C-acb88f1b2e64%22)%20AND%20(-obsoletedBy:*%20AND%20formatType:METADATA%20AND%20-formatId:(*dataone.org%5C%2Fcollections*%20OR%20*dataone.org%5C%2Fportals*))
fq:formatType:DATA%20AND%20-obsoletedBy:*
facet:true
facet.field:formatId
facet.limit:-1
f.formatId.facet.mincount:1
f.formatId.facet.missing:false
wt:json
rows:0
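
As a rough sketch, the parameters above can be assembled into a request URL with Python's standard library. The endpoint below is an assumption (the DataONE coordinating node's Solr query service); the source does not say which endpoint the original link used, so substitute the member node's endpoint if that is the one you need. `SOLR_ENDPOINT`, `PARAMS`, and `build_query_url` are illustrative names only.

```python
from urllib.parse import quote, urlencode

# Assumption: coordinating-node Solr endpoint. Swap in a member node's
# query endpoint if you want to search that node only.
SOLR_ENDPOINT = "https://cn.dataone.org/cn/v2/query/solr/"

# The same key:value pairs as listed above, written in decoded form.
PARAMS = {
    "q": (
        "{!join from=resourceMap to=resourceMap}"
        '((siteText:Greenland) OR isPartOf:"urn\\:uuid\\:dfbfc37d\\-e907\\-45e8\\-913f\\-acb88f1b2e64")'
        " AND (-obsoletedBy:* AND formatType:METADATA"
        " AND -formatId:(*dataone.org\\/collections* OR *dataone.org\\/portals*))"
    ),
    "fq": "formatType:DATA AND -obsoletedBy:*",
    "facet": "true",
    "facet.field": "formatId",
    "facet.limit": "-1",
    "f.formatId.facet.mincount": "1",
    "f.formatId.facet.missing": "false",
    "wt": "json",
    "rows": "0",
}


def build_query_url(endpoint=SOLR_ENDPOINT, params=PARAMS):
    """Return the full query URL, percent-encoding spaces as %20."""
    return f"{endpoint}?{urlencode(params, quote_via=quote)}"


url = build_query_url()
```

The resulting `url` can be passed to any HTTP client (for example `urllib.request.urlopen`). Because `rows` is `0`, only the facet counts are expected back, not the matching documents.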

Response:

{
    "responseHeader": {
        "status": 0,
        "QTime": 91,
        "params": {
            "q": "{!join from=resourceMap to=resourceMap}((siteText:Greenland) OR isPartOf:\"urn\\:uuid\\:dfbfc37d\\-e907\\-45e8\\-913f\\-acb88f1b2e64\") AND (-obsoletedBy:* AND formatType:METADATA AND -formatId:(*dataone.org\/collections* OR *dataone.org\/portals*))",
            "facet.limit": "-1",
            "facet.field": "formatId",
            "f.formatId.facet.mincount": "1",
            "f.formatId.facet.missing": "false",
            "fq": [
                "formatType:DATA AND -obsoletedBy:*",
                "(readPermission:\"public\")OR(writePermission:\"public\")OR(changePermission:\"public\")OR(isPublic:true)"
            ],
            "rows": "0",
            "facet": "true",
            "wt": "javabin",
            "version": "2"
        }
    },
    "response": {
        "numFound": 103788,
        "start": 0,
        "numFoundExact": true,
        "docs": []
    },
    "facet_counts": {
        "facet_queries": {},
        "facet_fields": {
            "formatId": [
                "CF-1.4",
                335,
                "application/MATLAB",
                3,
                "application/MATLAB-v7.3",
                19,
                "application/R",
                3,
                "application/msword",
                60,
                "application/octet-stream",
                15298,
                "application/pdf",
                288,
                "application/rtf",
                2,
                "application/vnd.google-earth.kml+xml",
                160,
                "application/vnd.ms-excel",
                27,
                "application/vnd.openxmlformats-officedocument.presentationml.presentation",
                1,
                "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
                82,
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
                33,
                "application/vnd.shp+zip",
                6,
                "application/x-gzip",
                13,
                "application/x-hdf5",
                1,
                "application/x-python",
                1,
                "application/x-rar-compressed",
                1,
                "application/x-tar",
                1444,
                "application/zip",
                224,
                "audio/x-wav",
                32713,
                "image/bmp",
                2,
                "image/geotiff",
                24,
                "image/jpeg",
                1,
                "image/png",
                135,
                "image/tiff",
                71,
                "netCDF-3",
                7912,
                "netCDF-4",
                42219,
                "text/csv",
                2187,
                "text/html",
                1,
                "text/markdown",
                1,
                "text/plain",
                457,
                "text/tsv",
                50,
                "text/x-rmarkdown",
                4,
                "text/xml",
                2,
                "video/mp4",
                2,
                "video/quicktime",
                6
            ]
        },
        "facet_ranges": {},
        "facet_intervals": {},
        "facet_heatmaps": {}
    }
}
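
Solr returns each facet field as a flat list that alternates value and count, as in `formatId` above. A minimal sketch for turning that list into something easier to read follows; `facet_counts` and `top_formats` are hypothetical helper names, and `sample` is a short excerpt of the response above rather than the full list.

```python
def facet_counts(flat):
    """Pair up a flat [value, count, value, count, ...] facet list."""
    return dict(zip(flat[0::2], flat[1::2]))


def top_formats(flat, n=3):
    """Return the n most common values as (value, count) tuples."""
    pairs = facet_counts(flat).items()
    return sorted(pairs, key=lambda kv: kv[1], reverse=True)[:n]


# Excerpt of facet_counts.facet_fields.formatId from the response above.
sample = [
    "CF-1.4", 335,
    "audio/x-wav", 32713,
    "netCDF-3", 7912,
    "netCDF-4", 42219,
    "text/csv", 2187,
]

counts = facet_counts(sample)
largest = top_formats(sample)
```

With a parsed JSON response `data`, the full list would be read from `data["facet_counts"]["facet_fields"]["formatId"]`.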

trey-stafford commented 3 months ago

GitHub Gist on how to upload data packages to DataONE: https://gist.github.com/csjx/863bf722590f59663c55043b326f803f