Open ryjen opened 6 months ago
Describe the bug
I'm not 100% sure, but it is probably worth looking into whether we are using Internet Archive (IA) metadata correctly.
Internet Archive generates its own metadata about uploads.
Currently the Save app just uploads a JSON file with our information, and IA generates its own metadata for that file as well (metadata about metadata).
It looks like there may be a separate endpoint for metadata we can use to update IA's metadata.
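To make the idea concrete, here is a rough sketch of what a request to a separate metadata endpoint could look like. It assumes IA's Metadata Write API accepts a form POST to `https://archive.org/metadata/<identifier>` with `-target`, `-patch` (a JSON Patch list), `access` and `secret` fields; that is my reading of the IA docs and needs to be verified before we rely on it. The function name `build_metadata_patch` and the choice to skip empty values are hypothetical, and nothing is actually sent.

```python
import json


def build_metadata_patch(identifier, fields, access_key, secret_key):
    """Build (url, form) for an IA metadata update without sending it.

    ASSUMPTION: the endpoint and the form field names below follow IA's
    Metadata Write API as I understand it; verify against the IA docs.
    """
    # One JSON Patch "add" operation per non-empty field, in a stable order.
    ops = [
        {"op": "add", "path": f"/{name}", "value": value}
        for name, value in sorted(fields.items())
        if value not in ("", None)
    ]
    form = {
        "-target": "metadata",      # patch the item-level metadata
        "-patch": json.dumps(ops),  # the patch travels as a JSON string
        "access": access_key,       # IA S3-style credentials
        "secret": secret_key,
    }
    return f"https://archive.org/metadata/{identifier}", form
```

The returned form would be sent URL-encoded in a POST by whatever HTTP client the app already uses. If this works, the separate JSON file (and IA's second metadata record for it) would no longer be needed.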
Expected behavior
No unnecessary files for metadata
Metadata does not include irrelevant information (for example originalFilePath or progress)
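As an illustration of the second expectation, a minimal sketch of stripping app-internal fields before anything is uploaded. The helper name `strip_local_fields` and the `LOCAL_ONLY_FIELDS` set are hypothetical; only `originalFilePath` and `progress` come from this issue, and other fields may belong on the list too.

```python
import json

# Fields named in this issue as irrelevant outside the app.
# ASSUMPTION: the real list may need more entries.
LOCAL_ONLY_FIELDS = frozenset({"originalFilePath", "progress"})


def strip_local_fields(media_json, local_only=LOCAL_ONLY_FIELDS):
    """Parse the Save media JSON and drop fields that are local to the device."""
    media = json.loads(media_json)
    return {key: value for key, value in media.items() if key not in local_only}
```

For example, `strip_local_fields('{"originalFileName": "a.jpeg", "progress": 0}')` keeps only `originalFileName`.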
Examples
Sample IA-generated metadata for Save app upload content:
<metadata>
  <identifier>E1j913BVoAMX59e-jpeg-zjpu</identifier>
  <collection>opensource_media</collection>
  <language>eng</language>
  <licenseurl>https://creativecommons.org/licenses/by/4.0/</licenseurl>
  <mediatype>image</mediatype>
  <title>E1j913BVoAMX59e.jpeg</title>
  <uploader>ryan@open-archive.org</uploader>
  <publicdate>2024-03-25 17:23:58</publicdate>
  <addeddate>2024-03-25 17:23:58</addeddate>
  <curation>[curator]validator@archive.org[/curator][date]20240325172447[/date][comment]checked for malware[/comment]</curation>
</metadata>
Media File metadata from Save app:
{"author":"","collectionId":6,"contentLength":1099011,"dateCreated":"Mar 25, 2024 10:23:09 AM","description":"","flag":false,"location":"","mediaHash":[],"hash":"15e64238066bfa3ba2a5c88bfcb551ff88278d3543c5f4067d69348b77cd82ee","contentType":"image/jpeg","originalFilePath":"file:///data/user/0/net.opendasharchive.openarchive.release/cache/20240325_102309.E1j913BVoAMX59e.jpeg","priority":0,"progress":0,"projectId":1,"selected":false,"serverUrl":"","status":4,"statusMessage":"","tags":"","originalFileName":"E1j913BVoAMX59e.jpeg","updateDate":"Mar 25, 2024 10:23:09 AM","uploadDate":"Mar 25, 2024 10:23:11 AM","id":6}
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <identifier>E1j913BVoAMX59e-jpeg-frih</identifier>
  <collection>opensource</collection>
  <language>eng</language>
  <mediatype>texts</mediatype>
  <uploader>ryan@open-archive.org</uploader>
  <title>E1j913BVoAMX59e-jpeg-frih</title>
  <publicdate>2024-03-25 17:24:09</publicdate>
  <addeddate>2024-03-25 17:24:09</addeddate>
  <curation>[curator]validator@archive.org[/curator][date]20240325172944[/date][comment]checked for malware[/comment]</curation>
  <identifier-access>http://archive.org/details/E1j913BVoAMX59e-jpeg-frih</identifier-access>
  <identifier-ark>ark:/13960/s243q45kwcz</identifier-ark>
</metadata>
Environment (please complete the following information):
Additional context