globalbioticinteractions / msb-para

specify dependencies on msb-host and other dwca records #1

Closed jhpoelen closed 3 years ago

jhpoelen commented 3 years ago

The Museum of Southwestern Biology (MSB) Parasite collection references records from other collections, such as the MSB Host collection.

However, these dependencies are not explicitly declared, so GloBI cannot resolve them. Note that https://github.com/globalbioticinteractions/vertnet implements a way to define a group of potentially related datasets, but does not explicitly define which datasets depend on which.

I suggest implementing a way to explicitly configure these (and other) dataset dependencies as part of the index configuration.

Related issues: https://github.com/globalbioticinteractions/globi-taxon-names/issues/7 and https://github.com/globalbioticinteractions/globalbioticinteractions/issues/659 .

fyi @arw36

jhpoelen commented 3 years ago

MSB Para dependencies have been declared using custom RSS categories, as documented in

https://github.com/globalbioticinteractions/globalbioticinteractions/releases/tag/v0.21.1 :

<?xml version="1.0"?>
<rss version="2.0"
... 
   <channel>
        <title>MSB Parasite Collection and Dependencies RSS</title>
...
     <item>
            <title>MSB Parasite Collection (Arctos)</title>
            ...
            <!-- this is the main dataset to be indexed -->
            <ipt:dwca>http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para</ipt:dwca>
     </item>
     <item>
            <title>MSB Host Collection (Arctos)</title>
            ...
            <!-- this is a dataset that msb para depends on (e.g., references one or more of their occurrence records) -->
            <ipt:dwca>http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_host</ipt:dwca>
           ...
            <!-- category used to indicate that an item is a dependency and should be used to resolve linked content, and not as a source of interaction data -->
            <category domain="http://www.w3.org/ns/prov">http://www.w3.org/ns/prov#wasUsedBy</category>
     </item>
   ...   

Working examples are in https://github.com/globalbioticinteractions/globalbioticinteractions/issues/659 .

For actual configuration, see https://github.com/globalbioticinteractions/msb-para/blob/a60e2670eebe109893d1d9f336aa4203fc80cf63/globi.json and https://github.com/globalbioticinteractions/msb-para/blob/a60e2670eebe109893d1d9f336aa4203fc80cf63/rss.xml .
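
To illustrate how a consumer might read such a feed, here is a minimal Python sketch that separates main datasets from dependencies by checking each item for the `wasUsedBy` category. It is not GloBI's implementation. The function name `split_feed` is hypothetical, the sample feed is a trimmed, self-contained variant of the snippet above, and the namespace URI bound to the `ipt` prefix is an assumption, since the snippet elides the namespace declarations.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI for the `ipt` prefix (elided in the snippet above).
IPT_NS = "http://ipt.gbif.org/"
# Category value that marks an item as a dependency (taken from the snippet).
PROV_WAS_USED_BY = "http://www.w3.org/ns/prov#wasUsedBy"

# Trimmed sample feed for illustration only; the real rss.xml has more fields.
FEED = f"""<?xml version="1.0"?>
<rss version="2.0" xmlns:ipt="{IPT_NS}">
  <channel>
    <title>MSB Parasite Collection and Dependencies RSS</title>
    <item>
      <title>MSB Parasite Collection (Arctos)</title>
      <ipt:dwca>http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para</ipt:dwca>
    </item>
    <item>
      <title>MSB Host Collection (Arctos)</title>
      <ipt:dwca>http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_host</ipt:dwca>
      <category domain="http://www.w3.org/ns/prov">{PROV_WAS_USED_BY}</category>
    </item>
  </channel>
</rss>"""


def split_feed(rss_text):
    """Return (sources, dependencies) as two lists of archive URLs.

    Items carrying the wasUsedBy category are treated as dependencies;
    all other items with an ipt:dwca element are treated as main datasets.
    """
    root = ET.fromstring(rss_text)
    sources, dependencies = [], []
    for item in root.iter("item"):
        url = item.findtext(f"{{{IPT_NS}}}dwca")
        if url is None:
            continue  # skip items that do not point at an archive
        categories = {(c.text or "").strip() for c in item.findall("category")}
        target = dependencies if PROV_WAS_USED_BY in categories else sources
        target.append(url.strip())
    return sources, dependencies


if __name__ == "__main__":
    sources, dependencies = split_feed(FEED)
    print("index:", sources)
    print("resolve only:", dependencies)
```

With the sample feed, the msb_para archive lands in the first list and the msb_host archive in the second, which mirrors the intent described in the XML comments: dependencies are used to resolve linked records, not as a source of interaction data.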