oxinabox / DataDepsGenerators.jl

Utility for developers to help define DataDeps registration blocks, for reusing existing Data with DataDeps.jl

Handle not having a list of URLs but just one (support PANGAEA) #77

Closed oxinabox closed 4 years ago

oxinabox commented 4 years ago

Closes #76 @briochemc

I thought we should already have PANGAEA support from one of the generic supports we have, and we almost did. It has JSON-LD, but it's not quite following the structure we expected.

Demonstration:

julia> generate("https://doi.pangaea.de/10.1594/PANGAEA.921913") |> println
register(DataDep(
    "Benthic foraminiferal stable isotopes, planktonic foraminiferal biostratigraphy and seismic profiles for IODP Site 356-U1463",
    """
        Dataset: Benthic foraminiferal stable isotopes, planktonic foraminiferal biostratigraphy and seismic profiles for IODP Site 356-U1463
        Website: https://doi.pangaea.de/10.1594/PANGAEA.921913
        Author: Jeroen Groeneveld et al.
        Date of Publication: August 24, 2020

        Determining the age of marine sediments is essential to reconstruct past changes in oceanography and climate. The oxygen isotopes of benthic foraminifera record long-term changes in global ice volume and deep-water temperature, and are commonly used to construct age models. However, continental margin settings often display much higher sedimentation rates due to regional input by rivers. Here, it is necessary to create a regional framework to allow precise dating of strata. We created such a framework for the Northwest Shelf (NWS) of Australia, which was cored by International Ocean Discovery Program (IODP) Expedition 356. We used oxygen and carbon isotopes in benthic foraminifera to construct an astronomically-tuned age model for IODP Site U1463. The natural gamma radiation (NGR) variations for IODP Site U1463 were then correlated to those of other IODP sites and industry wells in the area. […]
        """,
        ["https://doi.pangaea.de/10.1594/PANGAEA.921913?format=zip"],
))
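
The "just one URL rather than a list" case from the title can be sketched roughly as below. This is only an illustration, not the code in this PR: `as_url_list` is a hypothetical helper name, and the `example.org` URLs are placeholders. The idea is to normalize whatever the metadata yields into a vector, so that the generated `DataDep` always receives a list like the one in the output above.

```julia
# Hypothetical helper (not part of DataDepsGenerators.jl): accept either a
# single URL string or a collection of URLs, and always return Vector{String}.
as_url_list(url::AbstractString) = String[String(url)]
as_url_list(urls::AbstractVector) = String[String(u) for u in urls]

# A lone URL, as in the PANGAEA demonstration above.
one = as_url_list("https://doi.pangaea.de/10.1594/PANGAEA.921913?format=zip")

# A list of URLs (placeholder addresses) passes through unchanged.
many = as_url_list(["https://example.org/a.csv", "https://example.org/b.csv"])
```

Dispatching on the argument type keeps the single-URL case out of the calling code, which can then treat every source the same way.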
oxinabox commented 4 years ago

The CI is mega broken on this. Partly because it relies on external sites not changing their content, but also because it relies on them not changing their API. The package is probably broken in innumerable ways. With this PR it is less broken, and overall it is kind of robust because it uses multiple sources of data, so it is probably still usable. But the package could do with a lot of maintenance that I don't have the time or interest to do right now.

It works

briochemc commented 4 years ago

Thanks for this!

I understand the struggle for time... What about moving these packages into an organization? Might help find someone with time to spend? (Maybe JuliaData?)

oxinabox commented 4 years ago

I don't think so. The thing is, this is a tool that most people will use only a few times per year. By design, people don't use it directly in packages, just the output code, and the output code is much more robust than the package itself. So it's not particularly attractive to maintainers. Everyone should have something better to do than work on fixing this package's fragile tests. The package basically works.