CAMOBAP closed this issue 1 year ago
One problem with the iana-registries data set is that the media-types files are not actually XML files or any other kind of structured data file.
1. Maybe we can rename them to .txt, since they are all plain text.
2. There might be some post-processing we need to do to make the repository usable, e.g. structuring them?
@andrew2net it might be easiest to just do 2 here. When we extract the iana-registries information into Relaton, it is a batch process anyway. Isn't it?
The media-types documents were until May 1. I'll check what happened.
Implemented 2.
$ relaton fetch 'IANA media-types'
[relaton-iana] ("IANA media-types") fetching...
[relaton-iana] ("IANA media-types") found IANA media-types
<bibdata type="standard" schema-version="v1.2.3">
  <fetched>2023-05-09</fetched>
  <title format="text/plain">Media Types</title>
  <uri type="src">http://www.iana.org/assignments/media-types</uri>
  <docidentifier type="IANA" primary="true">IANA media-types</docidentifier>
  <docnumber>media-types</docnumber>
  <date type="updated">
    <on>2023-05-02</on>
  </date>
  <contributor>
    <role type="publisher"/>
    <organization>
      <name>Internet Assigned Numbers Authority</name>
      <abbreviation>IANA</abbreviation>
    </organization>
  </contributor>
  <language>en</language>
  <script>Latn</script>
</bibdata>
Problem
https://github.com/ietf-tools/relaton-data-iana/issues/9
It looks like the problem happens simply because the https://api.github.com/search/code API works this way out of the box.
It fetches only the first XML file from media-types: https://raw.githubusercontent.com/ietf-ribose/iana-registries/main/media-types/application/vnd.paos.xml
There is no explicit explanation of this in the official docs: https://docs.github.com/en/rest/search?apiVersion=2022-11-28
Possible solutions
q filter to query directories in depth
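To make the q-filter idea concrete, here is a minimal Python sketch of how such a query could be built and paged through. It is not how relaton-iana does it. The function names `build_search_url` and `iter_paths` are hypothetical. The `repo:`, `path:` and `extension:` qualifiers are the ones documented for GitHub's legacy code search. Whether they make the API return every file in the nested media-types directories is an assumption this thread does not confirm. The docs also say a code search should include at least one search term, so a qualifier-only query may need one added.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SEARCH_URL = "https://api.github.com/search/code"  # endpoint named in the thread


def build_search_url(repo, path=None, extension=None, page=1, per_page=100):
    """Build a code-search URL restricted to one repository.

    All qualifiers travel inside the single q parameter, separated by spaces.
    """
    qualifiers = [f"repo:{repo}"]
    if path:
        qualifiers.append(f"path:{path}")  # assumption: narrows to this directory
    if extension:
        qualifiers.append(f"extension:{extension}")
    params = {"q": " ".join(qualifiers), "per_page": per_page, "page": page}
    return f"{SEARCH_URL}?{urlencode(params)}"


def iter_paths(repo, token, path=None, extension=None):
    """Yield the path of every file the search returns, following pagination."""
    page, seen = 1, 0
    while True:
        request = Request(
            build_search_url(repo, path, extension, page),
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",  # code search needs auth
            },
        )
        with urlopen(request) as response:
            data = json.load(response)
        items = data["items"]
        for item in items:
            yield item["path"]
        seen += len(items)
        # The Search API serves at most 1000 results per query.
        if not items or seen >= min(data["total_count"], 1000):
            return
        page += 1
```

Calling `list(iter_paths("ietf-ribose/iana-registries", token, path="media-types", extension="xml"))` with a personal access token would then show how many media-types files the search actually returns, which is a quick way to check whether the filter solves the problem.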