caltechlibrary / caltechdata_api

Python library for using the CaltechDATA API

cannot edit entry #13

Closed caseyjlaw closed 1 year ago

caseyjlaw commented 1 year ago

I am using the caltechdata_write and caltechdata_edit functions to create an entry, add a DOI, and upload files. As we've discussed, the steps are split into two parts (create, then edit):

metadata = caltechdata.create_ctd(triggerfile=triggerfile, production=production, getdoi=getdoi, version=version)
caltechdata.edit_ctd(metadata, files=files, production=production)  # publishes by default

I tested this with the latest release and get an error saying the persistent identifier is not registered. Here is the logging from my script, followed by the error:

Created unpublished Caltech Data entry at https://data.caltech.edu/uploads/3ckt0-1as77
Created DOI to point to published location at https://data.caltech.edu/records/3ckt0-1as77
Created metadata from 221025aanu.json with doi 10.25800/jfaw-b122
Saving metadata_221025aanu.json
Got idv 3ckt0-1as77 from metadata
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/casa38/bin/dsaevent", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/event/cli.py", line 37, in ctd_send
    caltechdata.edit_ctd(metadata, files=files, production=production)  # publishes by default
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/event/caltechdata.py", line 80, in edit_ctd
    caltechdata_edit(ids=idv, token=token, metadata=metadata, files=files, production=production, publish=publish)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/caltechdata_api/caltechdata_edit.py", line 107, in caltechdata_edit
    raise Exception(result.text)
Exception: {"status": 404, "message": "The persistent identifier is not registered."}

I confirmed that the page at https://data.caltech.edu/uploads/3ckt0-1as77 exists and that the DOI is registered. Any idea what might be going wrong?
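
For anyone scripting around this, the traceback shows that caltechdata_edit raises a plain Exception whose text is the JSON response body. A minimal sketch of pulling the status and message out of such an exception follows; the helper name parse_api_error is hypothetical and is not part of caltechdata_api.

```python
import json


def parse_api_error(exc):
    """Return (status, message) from an exception whose text is a JSON body.

    Hypothetical helper, not part of caltechdata_api. Falls back to
    (None, raw_text) when the text is not a JSON object.
    """
    text = str(exc)
    try:
        payload = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None, text
    if not isinstance(payload, dict):
        return None, text
    return payload.get("status"), payload.get("message")
```

Applied to the exception in the traceback above, this returns (404, "The persistent identifier is not registered."), which lets a calling script separate this failure from unrelated errors.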

caseyjlaw commented 1 year ago

I had not tested for a while, but I had something working a month or two ago. I'm not sure this is related to the latest release (as of a few days ago). I confirmed that it is using version 1.0.0.

tmorrell commented 1 year ago

Could you send over the metadata you're uploading when you edit the record? The DOI processing steps are a bit tricky, so I need to work through your workflow.

caseyjlaw commented 1 year ago

Ok, thanks. I put the file at https://www.dropbox.com/s/irbolm3yx8oj8o2/metadata_221025aanu.json?dl=0.

tmorrell commented 1 year ago

I've improved the DOI handling, so please download and use v1.1.0.

There are also some metadata changes you should make. You should put all the identifiers together with the following labels:

"identifiers": [
        {
            "identifier": "10.25800/jfaw-b122",
            "identifierType": "DOI"
        },
        {
            "identifier": "221025aanu",
            "identifierType": "dsa-110-id"
        },
        {
            "identifier": "3ckt0-1as77",
            "identifierType": "cdid"
        }
    ],

And reference the license as "cc-by-4.0"
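
As a rough sketch of the identifier layout above, the list could be built in the script before the metadata is passed on. The function names build_identifiers and with_identifiers are hypothetical and not part of caltechdata_api; only the "identifiers", "identifier", and "identifierType" keys and the three type labels come from the example above.

```python
import copy


def build_identifiers(doi, dsa110_id, cdid):
    """Return the identifiers list with the three labels shown above."""
    return [
        {"identifier": doi, "identifierType": "DOI"},
        {"identifier": dsa110_id, "identifierType": "dsa-110-id"},
        {"identifier": cdid, "identifierType": "cdid"},
    ]


def with_identifiers(metadata, doi, dsa110_id, cdid):
    """Return a copy of metadata whose "identifiers" key holds the grouped list.

    The input dict is left unmodified (hypothetical helper).
    """
    updated = copy.deepcopy(metadata)
    updated["identifiers"] = build_identifiers(doi, dsa110_id, cdid)
    return updated
```

Calling with_identifiers(metadata, "10.25800/jfaw-b122", "221025aanu", "3ckt0-1as77") would reproduce the list shown above without touching the rest of the metadata.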

Let me know if you run into any other issues! The test CaltechDATA instance is also available at data.caltechlibrary.dev if you prefer to test there.

caseyjlaw commented 1 year ago

I've updated to version 1.1.0 and changed my identifier field to the new format. However, I'm having trouble getting "publish" to work, on both the test and production systems. Here's an example error message:

> dsaevent ctd-send --files ../other/221018aaaj.png --getdoi 221018aaaj.json --production
Created unpublished Caltech Data entry at https://data.caltech.edu/uploads/586dj-mej69
Created DOI to point to published location at https://data.caltech.edu/records/586dj-mej69
Got doi 10.25800/c238-2y76 from metadata
Created metadata from 221018aaaj.json with doi 10.25800/c238-2y76
Saving metadata_221018aaaj.json
Got idv 586dj-mej69 from metadata
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/casa38/bin/dsaevent", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/event/cli.py", line 41, in ctd_send
    caltechdata.edit_ctd(metadata, files=files, production=production)  # publishes by default
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/event/caltechdata.py", line 94, in edit_ctd
    caltechdata_edit(idv=idv, token=token, metadata=metadata, files=files, production=production, publish=publish)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/caltechdata_api/caltechdata_edit.py", line 149, in caltechdata_edit
    raise Exception(result.text)
Exception: {"status": 404, "message": "The persistent identifier is not registered."}

The upload is https://data.caltech.edu/uploads/586dj-mej69. The metadata json that I save after the initial creation is at https://www.dropbox.com/s/20us030aazq8da5/metadata_221018aaaj.json?dl=0.

Any ideas what causes the persistent identifier error?

tmorrell commented 1 year ago

Apologies for the delay in looking at this. I just released v1.1.1, which might help with your issue.

That metadata file seems to be in the old format and doesn't match the record, so I haven't been able to reproduce the bug you've found. One other possible cause is reusing persistent identifiers in your testing: if you have reused a DOI or DSA-110 id on another record, the system won't let you create another record with the same id.

If the new version doesn't help and repeated identifiers are not the cause, upload the metadata files and I'll try to reproduce the issue and figure it out.
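
One way to check for reused identifiers before submitting is to scan the saved metadata files locally. The sketch below assumes each file has already been loaded into a dict that uses the "identifiers" list format with "identifier" and "identifierType" keys; the function name find_reused_identifiers is hypothetical, and the check only covers the files it is given, not records that already exist on the server.

```python
from collections import defaultdict


def find_reused_identifiers(records):
    """Find identifiers that appear in more than one metadata record.

    `records` maps a label (for example a file name) to a metadata dict.
    Returns {(identifierType, identifier): [labels...]} for every pair
    seen in two or more records. Hypothetical helper for local checks.
    """
    seen = defaultdict(list)
    for label, metadata in records.items():
        for entry in metadata.get("identifiers", []):
            key = (entry.get("identifierType"), entry.get("identifier"))
            seen[key].append(label)
    return {key: labels for key, labels in seen.items() if len(labels) > 1}
```

An empty result means no identifier is shared among the supplied records; any entry in the result names the records that collide.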

caseyjlaw commented 1 year ago

I updated to v1.1.1 and am still seeing the same error. I am using the development system, although some of the logging below refers to the production system.

Detailed log from my script:

dsaevent ctd-send --getdoi --files 221018aaaj.png 221018aaaj.json 
Created unpublished Caltech Data entry at https://data.caltech.edu/uploads/sr0he-w6t15
Created DOI to point to published location at https://data.caltech.edu/records/sr0he-w6t15
Got doi 10.22013/90q2-rw40 from metadata
Created metadata from 221018aaaj.json with doi 10.22013/90q2-rw40
Saving metadata_221018aaaj.json
Got idv sr0he-w6t15 from metadata
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/casa38/bin/dsaevent", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/event/cli.py", line 41, in ctd_send
    caltechdata.edit_ctd(metadata, files=files, production=production)  # publishes by default
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/event/caltechdata.py", line 94, in edit_ctd
    caltechdata_edit(idv=idv, token=token, metadata=metadata, files=files, production=production, publish=publish)
  File "/home/ubuntu/anaconda3/envs/casa38/lib/python3.8/site-packages/caltechdata_api/caltechdata_edit.py", line 149, in caltechdata_edit
    raise Exception(result.text)
Exception: {"status": 404, "message": "The persistent identifier is not registered."}

tmorrell commented 1 year ago

I've finally been able to reproduce the issue, and I've fixed it in v1.2.0. Let me know if you run into any more problems!

caseyjlaw commented 1 year ago

Great! I tested the new version on the development platform and it works as expected.