kcondon closed this issue 7 years ago
Relates also to #889 and #886
I would like this too.
CNRI announced that
In the new management structure under DONA, both the GHR-specific software and the Handle.Net software will be substantially updated.
Hopefully that would make this easier to implement.
@eaquigley @mcrosas @scolapasta Hi guys: a ticket came in this morning reporting that handles are not going directly to a dataset, but rather to a list of recently published data (example: http://hdl.handle.net/1902.1/21720). If you click on that link, you don't go directly to the dataset. Does this have to do with this registration issue, or should I create a new ticket?
If you input the actual handle in the form at http://hdl.handle.net and check the box "don't redirect to URLs", you'll see that the URL for the handle is http://thedata.harvard.edu/dvn/study?globalId=hdl:1902.1/21720
I think the redirects performed by thedata.harvard.edu are incorrect. But the handle records should really be updated to point at the current URLs.
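To make the suggested record update concrete, here is a minimal sketch of deriving a current URL from the legacy one stored in the handle record. The legacy pattern comes from the record quoted above; the target form (`dataset.xhtml?persistentId=` on dataverse.harvard.edu) and the class name `LegacyHandleUrl` are assumptions for illustration, not something confirmed in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: maps a legacy DVN study URL to an assumed Dataverse 4 dataset URL. */
final class LegacyHandleUrl {
    // Legacy form, as seen in the handle record for hdl:1902.1/21720.
    private static final Pattern LEGACY = Pattern.compile(
            "^https?://thedata\\.harvard\\.edu/dvn/study\\?globalId=(hdl:[^&#\\s]+)$");

    // Assumed target form; adjust to whatever the installation actually serves.
    static final String CURRENT_BASE =
            "https://dataverse.harvard.edu/dataset.xhtml?persistentId=";

    /** Returns the rewritten URL, or empty if the input is not a legacy study URL. */
    static Optional<String> toCurrentUrl(String legacyUrl) {
        Matcher m = LEGACY.matcher(legacyUrl.trim());
        return m.matches() ? Optional.of(CURRENT_BASE + m.group(1)) : Optional.empty();
    }
}

final class LegacyHandleUrlDemo {
    public static void main(String[] args) {
        String legacy = "http://thedata.harvard.edu/dvn/study?globalId=hdl:1902.1/21720";
        System.out.println(LegacyHandleUrl.toCurrentUrl(legacy).orElse("not a legacy URL"));
    }
}
```

The same mapping could drive either a temporary rewrite rule or a one-off pass that updates the handle records themselves; only the latter removes the dependency on the old host.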
@bencomp thanks! @scolapasta does this need a new ticket in github?
@bencomp thanks! I didn't know how to troubleshoot handles like this. Very handy to see this:
We had confirmed this was a redirect/rewrite issue earlier but have not yet had a chance to correct it.
Note this is also affecting Gary's datasets: https://dataverse.harvard.edu/?globalId=hdl:1902.1/11193
Eleni Castro Research Coordinator, Data Curation and Outreach IQSS, Harvard University 617-496-0703 http://www.iq.harvard.edu/people/eleni-castro http://orcid.org/0000-0001-9767-8536
@kcondon redirect/rewrite might be a temporary solution, but you should really update the Handle records to point at the correct URLs.
Thanks, Ben. I'm fully aware of the options and am pursuing them according to my own best judgement.
So an update for all the watchers:
- We resolved the first issue from a week ago by updating the handle record to use the correct URL.
- We had intended to update all handle records to use the correct URLs after we changed URLs, but this seems not to have been done, or not done completely, so a review needs to be performed. #2809
- There is also a rewrite issue that makes the general problem visible as a result of moving the rewrite rules from Glassfish to Apache. These rules need to be retested and corrected. #2810
- We are currently trying to get a release out, so we have been fixing the immediate issue and will address the larger issues as soon as we have time.
Dear all, is there any news about this issue?
Kind regards, Ivo
This appears to be a duplicate of #889 which says "This is a parent ticket for the various handle/doi and handle/doi registration tickets." We probably only need one parent ticket/issue.
This morning @djbrooke and I discussed @jo-pol's pull request at #3146 and decided to use this issue to represent this work at https://waffle.io/IQSS/dataverse, so I put this issue in the "Code Review" column. I would guess that @sekmiller is in the best position to review the pull request.
Just a heads up that @solhm ("teya") in IRC is interested in Handle support as well: http://irclog.iq.harvard.edu/dataverse/2016-10-26#i_43738
OK, tested and found a few issues:
I believe the intention is to have a generic message with a variable for the registration service, in this case it would be Handle.Net rather than DataCite. @scolapasta would have more details.
@scolapasta @landreev @sekmiller @djbrooke would have more information and guidance on these technical issues.
Thanks @kcondon!
I'll tag Jo in the PR to see how we should proceed.
@ekoi (cc @4tikhonov) could you look into these issues too?
Hi @ekoi @4tikhonov and @jo-pol - let me know if there's anything that we can do to help move this along. Thanks again for working on this valuable feature!
Hi @kcondon and @djbrooke, how was it tested? Do you have some unit tests for this functionality, or was it tested manually?
@4tikhonov It was tested manually. I'll have a list shortly. Here are my tests: https://docs.google.com/document/d/1_HdMUKTBV-p5GU-31uZ7BmQYxWs7-lsvAVsWr150mmU/edit?usp=sharing
Let us know if you have any questions, thanks.
Hey @4tikhonov - let me know if there's anything that we can do to help this move forward. If possible, we'd like to include it in the 4.6 release, which we're planning on completing before the end of the year. Thanks!
Thanks @kcondon for linking up the test cases here!
I wish I had had the document with test scenarios and expected results back in June, while I still had a chance to work on this issue. Now I'm limited to keeping an eye on progress on either side of the ocean. Just some questions that cross my mind:
The document says "Handle creation workflow is more like EZID"
Hey @jo-pol - thank you for everything that you and other folks at DANS have done to move handles along. I think we're very close. I understand that there are a few lingering issues and we'll work with you guys to fix them in a way that works with the time that both of our groups have available. I'll work with @4tikhonov and @scolapasta to determine the specifics of moving forward with Handles, as I understand that the DANS team is focused on the move to Dataverse 4 (which is great!!).
Remaining items (this is a restatement of Kevin's comment above):
Note that these items apply to Handles only. There are slightly separate flows for EZID and DataCite persistent IDs.
@sekmiller, @landreev, or @scolapasta:
@solhm is working on this these days and had some follow up questions. I'm adding them here in italics below. Can someone please provide some clarification so that he can keep this moving? Thanks!
Other than those issues, at what point are the datasets pushed to the Handle.Net server or made public? I can see a publicizeIdentifier function for DOIs in both providers (EZID and DataCite), but I can't find a corresponding function for the hdl protocol.
One key thing to understand when working on this functionality is that the general idea is to move logic specific to a registration provider out of the commands (or other areas of core code) and into provider-specific implementations of the registration provider interface, PersistentIdRegistrationServiceBean. That way, someone should be able to add a brand-new provider as needed and NOT touch any of the core Dataverse code. Much of this was already done by the external partner who started this branch, but it may not be complete.
So if you see any place in the core code that references data cite, ezid, or handle, that needs to change. For example, when you say "This error came from PublishDatasetCommand but it first checks if the protocol is DOI" that immediately raised the flag to me that this was not completed here.
Note that some providers are not as robust as others, so some of the implementation methods will be empty.
I'll review the code tomorrow with our developers to answer your specific questions, but I hope that this general guideline can already help you make some progress.
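As a rough illustration of the guideline above, the sketch below shows core code that never names a provider. Everything here is hypothetical: `PidProvider` merely stands in for the real PersistentIdRegistrationServiceBean interface (whose actual methods are not shown in this thread), and the method names, registry, and in-memory maps are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the registration provider interface; not the real Dataverse API. */
interface PidProvider {
    String protocol();                                   // e.g. "doi" or "hdl"
    void register(String identifier, String targetUrl);
    // Empty by default, for providers that have nothing to do at this step.
    default void publicize(String identifier) { }
}

/** Hypothetical Handle provider backed by an in-memory map instead of a handle server. */
final class HandleProvider implements PidProvider {
    final Map<String, String> records = new HashMap<>();

    @Override public String protocol() { return "hdl"; }

    @Override public void register(String identifier, String targetUrl) {
        records.put(identifier, targetUrl);
    }
}

/** Looks up the provider for a protocol so that callers never branch on provider names. */
final class PidProviderRegistry {
    private final Map<String, PidProvider> byProtocol = new HashMap<>();

    void add(PidProvider provider) { byProtocol.put(provider.protocol(), provider); }

    PidProvider forProtocol(String protocol) {
        PidProvider provider = byProtocol.get(protocol);
        if (provider == null) {
            throw new IllegalArgumentException("No provider for protocol: " + protocol);
        }
        return provider;
    }
}

/** "Core" code: contains no reference to DataCite, EZID, or Handle. */
final class PublishSketch {
    static void publish(PidProviderRegistry registry, String protocol,
                        String identifier, String targetUrl) {
        PidProvider provider = registry.forProtocol(protocol);
        provider.register(identifier, targetUrl);
        provider.publicize(identifier);
    }
}

final class ProviderDemo {
    public static void main(String[] args) {
        PidProviderRegistry registry = new PidProviderRegistry();
        HandleProvider handles = new HandleProvider();
        registry.add(handles);
        PublishSketch.publish(registry, "hdl", "1902.1/21720", "https://example.org/dataset");
        System.out.println(handles.records);
    }
}
```

With this shape, adding a new provider means adding one class and registering it; a check such as "is the protocol DOI?" inside a command is a sign that provider logic has leaked into core code.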
Actually, I can answer the last one (5) without reviewing the code. You are correct that when deaccessioning we do NOT delete the persistent identifier. It is important that the persistent identifier still exist, in order to point to a tombstone page (explaining why the dataset was deaccessioned).
However, "destroy" is a special function that completely removes the dataset from dataverse. There is no longer a tombstone page. For example, we use this to remove test datasets, or datasets that should never have been created. In this case, we do want to delete the persistent identifier.
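The difference between the two operations can be sketched as follows. This is a toy in-memory model, assuming hypothetical names (`PidLifecycleSketch`, `State`); it is not how the Dataverse commands are actually written.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of how deaccession and destroy treat the persistent identifier. */
final class PidLifecycleSketch {
    enum State { PUBLISHED, DEACCESSIONED }

    final Map<String, String> pidRecords = new HashMap<>(); // identifier -> landing URL
    final Map<String, State> datasets = new HashMap<>();    // identifier -> dataset state

    void publish(String identifier, String landingUrl) {
        pidRecords.put(identifier, landingUrl);
        datasets.put(identifier, State.PUBLISHED);
    }

    /** Deaccession: the dataset changes state, but the identifier record is left untouched
     *  so that it can still resolve to a tombstone page. */
    void deaccession(String identifier) {
        datasets.replace(identifier, State.DEACCESSIONED);
    }

    /** Destroy: the dataset is removed completely, so the identifier is deleted as well. */
    void destroy(String identifier) {
        datasets.remove(identifier);
        pidRecords.remove(identifier);
    }
}

final class PidLifecycleDemo {
    public static void main(String[] args) {
        PidLifecycleSketch sketch = new PidLifecycleSketch();
        sketch.publish("1902.1/11193", "https://example.org/dataset");
        sketch.deaccession("1902.1/11193");
        System.out.println("after deaccession: " + sketch.pidRecords.keySet());
        sketch.destroy("1902.1/11193");
        System.out.println("after destroy: " + sketch.pidRecords.keySet());
    }
}
```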
To answer 4, the only other place we can think of is in the directory structure for the files, but that is deleted when the draft is deleted.
@scolapasta Very true, core code shouldn't reference providers explicitly. For question 5, you're right: on "destroy" the PID should also be removed. And regarding answer 4: in that case, it's working fine.
So if you see any place in the core code that references data cite, ezid, or handle, that needs to change. For example, when you say "This error came from PublishDatasetCommand but it first checks if the protocol is DOI" that immediately raised the flag to me that this was not completed here.
@scolapasta That's correct, but even in the master and v4.6.1 branches, if you check PublishDatasetCommand as an example, DataCite and EZID are all over the code. In fact, as you suggested, I used DANS-KNAW/dataver. But since 4.6.1 is released, what is the logical explanation?
I'm sure they are all over the code in both master and 4.6.1 (which should currently be the same). The idea of moving logic into provider-specific implementations of the PersistentIdRegistrationServiceBean isn't yet in our production code, and it is what the branch you are working on is all about. That is what we want to get into the core code ASAP.
@sekmiller and I just made a new branch at https://github.com/IQSS/dataverse/tree/2437-handle-support based on https://github.com/DANS-KNAW/dataverse/tree/dans-master
It's quite a bit behind the IQSS "develop" branch. Here's a preview of the merge conflicts that need to be resolved:
murphy:dataverse pdurbin$ git merge develop
Removing src/test/resources/json/complete-dataverse.json
Auto-merging src/main/webapp/resources/js/shib/idpselect_config.js
CONFLICT (content): Merge conflict in src/main/webapp/resources/js/shib/idpselect_config.js
Auto-merging src/main/webapp/resources/js/shib/idpselect.js
CONFLICT (content): Merge conflict in src/main/webapp/resources/js/shib/idpselect.js
Removing src/main/webapp/mydata_templates/cards.html
Auto-merging src/main/webapp/loginpage.xhtml
CONFLICT (content): Merge conflict in src/main/webapp/loginpage.xhtml
Removing src/main/java/edu/harvard/iq/dataverse/util/MD5Checksum.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/harvest/server/xoai/XsetRepository.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/harvest/server/xoai/XitemRepository.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/harvest/server/xoai/Xitem.java
Removing src/main/java/edu/harvard/iq/dataverse/harvest/server/web/xMetadata.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetCommand.java
CONFLICT (content): Merge conflict in src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetCommand.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java
CONFLICT (content): Merge conflict in src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DestroyDatasetCommand.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java
CONFLICT (content): Merge conflict in src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/dataaccess/DataConverter.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/authorization/providers/builtin/DataverseUserPage.java
Removing src/main/java/edu/harvard/iq/dataverse/authorization/UserLister.java
Removing src/main/java/edu/harvard/iq/dataverse/authorization/MyDataQueryHelperServiceBean.java
Removing src/main/java/edu/harvard/iq/dataverse/authorization/MyDataQueryHelper.java
Removing src/main/java/edu/harvard/iq/dataverse/authorization/ExternalLinkAuthenticationProvider.java
Removing src/main/java/edu/harvard/iq/dataverse/api/TestApi.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
CONFLICT (content): Merge conflict in src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
Auto-merging src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java
Auto-merging src/main/java/Bundle.properties
CONFLICT (content): Merge conflict in src/main/java/Bundle.properties
Auto-merging scripts/deploy/phoenix.dataverse.org/cert.md
Removing scripts/deploy/apitest.dataverse.org/rebuild
Removing scripts/deploy/apitest.dataverse.org/prep
Removing scripts/deploy/apitest.dataverse.org/post
Removing scripts/deploy/apitest.dataverse.org/dv-root.json
Removing scripts/deploy/apitest.dataverse.org/deploy
Removing scripts/api/data/aupr-echo.json
Removing doc/sphinx-guides/source/user/super-user.rst
Removing doc/sphinx-guides/source/user/img/image4institutional.png
Removing doc/sphinx-guides/source/user/img/image3institutional.png
Removing doc/sphinx-guides/source/user/img/image2institutional.png
Removing doc/sphinx-guides/source/user/img/image1institutional.png
Removing doc/sphinx-guides/source/installation/installer-script.rst
Auto-merging doc/sphinx-guides/source/installation/config.rst
CONFLICT (content): Merge conflict in doc/sphinx-guides/source/installation/config.rst
Removing doc/sphinx-guides/source/img/feature-request-process.png
Removing doc/sphinx-guides/source/developers/windows.rst
Removing doc/sphinx-guides/source/developers/ubuntu.rst
Removing doc/Sphinx/source/img/image4institutional.png
Removing doc/Sphinx/source/img/image3institutional.png
Removing doc/Sphinx/source/img/image2institutional.png
Removing doc/Sphinx/source/img/image1institutional.png
Removing conf/R/rserve-startup.sh
Removing conf/R/r-setup.sh
Removing conf/R/Rserv.pwd
Removing conf/R/Rserv.conf
Auto-merging PULL_REQUEST_TEMPLATE.md
CONFLICT (content): Merge conflict in PULL_REQUEST_TEMPLATE.md
Auto-merging .gitignore
CONFLICT (content): Merge conflict in .gitignore
Automatic merge failed; fix conflicts and then commit the result.
murphy:dataverse pdurbin$
At http://irclog.iq.harvard.edu/dataverse/2017-04-24 @solhm and I talked about pull request #3781 and how code review is blocked because it's showing 500+ files changed. He pointed out https://github.com/IQSS/dataverse/compare/develop...solhm:2437-handle-support which shows only 18 files changed, and a code review could be performed on a pull request made from this branch. There are a few changes I'd like to see:
"thanks , i saw it. and I'll take back those changes. And i'll create PR . The PR going to be from https://github.com/solhm/dataverse/tree/2437-handle-support into https://github.com/IQSS/dataverse/tree/develop " -- @solhm at http://irclog.iq.harvard.edu/dataverse/2017-04-25
@solhm thanks for making pull request #3800! I just dragged this issue to Code Review at https://waffle.io/IQSS/dataverse and closed the previous pull requests (#3146 and #3781).
The new test/dev. handlenet server should now be running on dvn-vm7.hmdc.harvard.edu. The software is installed in /usr/local/handlenet/hsj-8.1.1; the handle database is in /usr/local/handlenet/svr_1
The private key you will need in order to register and delete handles is in /usr/local/handlenet/svr_1/privkey.bin
The documentation that came with this version says that they now have a better web-based admin tool:
8) As of version 8.1 a browser-based administration tool is made
available by the handle server. In a browser, you can open
the URL https://
The older command-line admintool should still be working too; (you may or may not need to use either for your testing).
The name space is 20.500.12050
Due to time constraints, we did not complete the code for fully supporting handles in v4.x. Currently handles are presented and resolved correctly but not created. This ticket is intended to restore that functionality.
Please note that several other tickets address handle registration robustness, similar to what was done for DOIs: how we handle server-unavailable situations and how we check for duplicate IDs before creating them.
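A minimal sketch of the duplicate check mentioned above, under stated assumptions: the class `HandleSuffixPicker`, the `ServerUnavailableException` type, and the idea of passing the lookup in as a predicate are all hypothetical and only illustrate the shape of the logic. A real lookup would query the handle server and could fail when it is down.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper that picks a handle that is not registered yet. */
final class HandleSuffixPicker {

    /** Hypothetical signal that the handle server could not be reached during a lookup. */
    static final class ServerUnavailableException extends RuntimeException {
        ServerUnavailableException(String message) { super(message); }
    }

    /**
     * Tries up to maxAttempts candidate suffixes and returns the first handle for which
     * the lookup reports "not registered". Returns empty if every candidate is taken.
     * A ServerUnavailableException thrown by the lookup propagates to the caller, which
     * can then leave the dataset unregistered and retry later instead of guessing.
     */
    static Optional<String> pickUnused(String prefix, Supplier<String> suffixes,
                                       Predicate<String> alreadyRegistered, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String candidate = prefix + "/" + suffixes.get();
            if (!alreadyRegistered.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}

final class HandleSuffixPickerDemo {
    public static void main(String[] args) {
        // Fixed candidates and a fixed "taken" set stand in for a generator and a server.
        Iterator<String> candidates = Arrays.asList("AAA", "BBB").iterator();
        Set<String> taken = Collections.singleton("20.500.12050/AAA");
        System.out.println(HandleSuffixPicker.pickUnused(
                "20.500.12050", candidates::next, taken::contains, 2));
    }
}
```

Keeping the lookup behind a predicate means the retry logic can be unit-tested without a running handle server.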