NIAID-Data-Ecosystem / nde-crawlers

Harvesting infrastructure to collect and standardize dataset and computational tool metadata
Apache License 2.0

Iterative improvement to the OmicsDI crawler #56

Open flaneuse opened 2 years ago

flaneuse commented 2 years ago

Currently, all the distribution objects in OmicsDI appear to link to an .xml file (example). However, the actual file metadata object provides a list of downloadable files (in this example, a .pdf).

In the next run of the OmicsDI crawler, it would be good to improve the distribution object so that users can find these data files more easily, for example:

"distribution": [ {
    "name": "annrheumdis-2012-202031-s1.pdf",
    "url": "https://www.ebi.ac.uk/biostudies/files/S-EPMC3841758/annrheumdis-2012-202031-s1.pdf",
    "encodingFormat": "application/pdf"
} ]
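
A rough sketch of how the crawler might build objects of this shape, in case it helps the discussion. It assumes the downloadable files are already available as a plain list of URLs; how that list is actually extracted from the OmicsDI file metadata is not shown here, and `build_distribution` is a hypothetical helper name, not an existing function in nde-crawlers. The MIME type is guessed from the file extension with Python's standard `mimetypes` module, so `encodingFormat` is simply left out when the extension is not recognized.

```python
import mimetypes
from posixpath import basename
from urllib.parse import urlparse


def build_distribution(file_urls):
    """Turn a list of file URLs into distribution objects (hypothetical helper)."""
    distribution = []
    for url in file_urls:
        # The file name is the last path segment of the URL.
        name = basename(urlparse(url).path)
        if not name:
            # Skip URLs that end in "/" and so do not name a file.
            continue
        entry = {"name": name, "url": url}
        # Guess the MIME type from the extension; omit the field if unknown.
        mime_type, _ = mimetypes.guess_type(name)
        if mime_type:
            entry["encodingFormat"] = mime_type
        distribution.append(entry)
    return distribution


if __name__ == "__main__":
    files = [
        "https://www.ebi.ac.uk/biostudies/files/S-EPMC3841758/annrheumdis-2012-202031-s1.pdf",
    ]
    print(build_distribution(files))
```

Guessing from the extension keeps the crawler from having to request each file just to read its Content-Type header, at the cost of missing types for unusual extensions.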