Closed pameyer closed 6 years ago
This includes:
Initial commit for UI prototype added to branch 4610-upload-dual-mode
. Changes include:
Tasks to tackle in the next phase of development:
This probably needs it's own issue, but the documentation for how to set up Dropbox for file upload is severely lacking. It took me far too long to find these required steps in the Dropbox documentation. We should link to that page from our guides to make it easier to set up. Currently, we provide no instructions.
3db4486 was so far develop
that I couldn't even run the code so I merged the lastest in fb4e4f3. Heads up to @dlmurphy that I went with the version of @mheppler 's branch, which will undo fe59aef so you two should get on the same page about that tooltip.
@pdurbin, I just talked with @mheppler and we're good with that merge. That tooltip will likely be further edited in the near future as part of the 5.0 redesign process anyway.
I took a quick look at pull request #4862 and noticed that there are merge conflicts in the bundle.
@sekmiller fixed the merge conflict in b5f81e3 and I offered to help with weaving this new feature into the guides somehow. He's going to take a look at the render logic for installations that haven't been configured with Dropbox or DCM. Right now it's all visible, including some placeholder text that @dlmurphy or others might want to weigh in on:
Question for @pameyer (and others): Is it ok if it's not possible to turn off "native" or "traditional" file upload? I believe you have it turned off it your fork but it's not possible in the pull request as of this writing. I checked with @sekmiller
Yes, it does make sense to retain the ability for different upload methods to be configurable (at minimum at the installation level, possibly at the dataverse level).
@pameyer ok, thanks. @sekmiller we can't just put !settingsWrapper.rsyncUpload
back in to disable "Upload with HTTP via your browser". (Also, I just noticed that we need to move that English out of editFilesFragment.xhtml and into the bundle.) I guess we need a setting or something to disable "Upload with HTTP via your browser". @pameyer in your mind when "Upload with HTTP via your browser" is disabled does that mean that APIs that use HTTP are disabled as well? Or would that be a different setting?
I think it's better to be consistent with APIs and HTTP UI (aka - if HTTP browser uploads are disabled, HTTP upload APIs should also be disabled).
@sekmiller and I spoke with @djbrooke @pameyer @landreev and @oscardssmith this morning. In 3ba0036 I adjusted the FileUploadMethods
enum to rename "NATIVE" to "native/http" in preparation for making it possible to disable http upload for both GUI and API. @sekmiller and I believe it will be better to have "native/http" explicitly set as a database setting out of the box rather than having implicit logic in the code. So we'll be updating setup scripts to add it. I also added a TODO to add "native/dropbox" if we feel like it helps makes the code and user experience for sysadmins better. Some discussion about if it would be better to move the Dropbox key from a JVM option to a database setting. (I'd probably leave it alone but I don't have a strong opinion about this.)
Pull request #4862 is looking good, I think, so I moved it to QA. Heads up to @mheppler and @dlmurphy that in 4a7fcf5 I removed some help text that I didn't find very helpful. If someone can suggest better text, please advise and it can be put back in.
Issues so far:
Rsync is now working, refer to checklist above for issues.
Native and Dropbox work in basic UI testing. Could not get Rsync to work by following dcm_mock instructions. Actually noticed a server log error: Caused by: java.io.IOException: Dataset doi:10.5072/FK2/7L67Q5 is not locked for DCM upload batch-jobs log says, SEVERE: Dataset {0} is not locked for DCM upload. Exiting
Server log error:
[2018-07-20T18:35:19.966-0400] [glassfish 4.1] [SEVERE] [] [job-73] [tid: _ThreadID=342 _ThreadName=concurrent/__defaultManagedExecuto rService-managedThreadFactory-Thread-2] [timeMillis: 1532126119966] [levelValue: 1000] [[ Dataset doi:10.5072/FK2/VFIW5V is not locked for DCM upload. Exiting]]
[2018-07-20T18:35:19.966-0400] [glassfish 4.1] [WARNING] [] [com.ibm.jbatch.container.impl.JobThreadRootControllerImpl] [tid: _ThreadI D=342 _ThreadName=concurrent/__defaultManagedExecutorService-managedThreadFactory-Thread-2] [timeMillis: 1532126119966] [levelValue: 9 00] [[ Caught throwable in main execution loop with Throwable message: java.io.IOException: Dataset doi:10.5072/FK2/VFIW5V is not locked fo r DCM upload, and stack trace: com.ibm.jbatch.container.exception.BatchContainerRuntimeException: java.io.IOException: Dataset doi:10. 5072/FK2/VFIW5V is not locked for DCM upload at com.ibm.jbatch.container.artifact.proxy.JobListenerProxy.beforeJob(JobListenerProxy.java:45) at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.jobListenersBeforeJob(JobThreadRootControllerImpl.java:303) at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.originateExecutionOnThread(JobThreadRootControllerImpl.java:103) at com.ibm.jbatch.container.util.BatchWorkUnit.run(BatchWorkUnit.java:80) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.glassfish.enterprise.concurrent.internal.ManagedFutureTask.run(ManagedFutureTask.java:141) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) Caused by: java.io.IOException: Dataset doi:10.5072/FK2/VFIW5V is not locked for DCM upload at edu.harvard.iq.dataverse.batch.jobs.importer.filesystem.FileRecordJobListener.beforeJob(FileRecordJobListener.java:177) at com.ibm.jbatch.container.artifact.proxy.JobListenerProxy.beforeJob(JobListenerProxy.java:43) ... 10 more ]]
[2018-07-20T18:35:19.966-0400] [glassfish 4.1] [SEVERE] [] [job-73] [tid: _ThreadID=342 _ThreadName=concurrent/__defaultManagedExecutorService-managedThreadFactory-Thread-2] [timeMillis: 1532126119966] [levelValue: 1000] [[ Job Failed. See Log for more information.]]
Need to test API enable/disable, mixed mode, rsync only mode. Need to follow up on above questions.
Some notes from conversation about Add New Dataset page with Phil, Kevin, and Tania:
Make rsync window collapsible like the other two.
Properly indent the messaging in the rsync window
Replace the text in the rsync window with:
rsync allows you to run a script that transfers your files into or out of Dataverse via SSH. It is useful for extremely large files (>1GB), or packages containing a large number of files. Once you have saved this dataset, you can upload your data using rsync via the "Upload Files" button on the dataset page.
Sketch that hints at render logic for dropbox and rsync panels:
Replace the text in the rsync window
Fixed in 7452e66
Properly indent the messaging in the rsync window
Fixed in a867f5b
collapsible
My understanding is that there is still work to be done in this area. @djbrooke will do housekeeping on the issue.
Will discuss with @mheppler and other interested parties when he's back next week.
Thanks @sekmiller and @mheppler for working together to generate a checklist of remaining items here. I'll be happy to help with questions of scope, etc.
Talked to @sekmiller. Outline of cleanup:
DatasetPage.workingVersion.hasPackageFile
DatasetPage.workingVersion.hasPackageFile
render logic that disables Upload Files btn in filesFragment.xhtmlrsyncUpload_popup
from dataset.xhtmlAt standup this morning I volunteered to deploy the code to the dev1 server in various configurations:
As of ec8aa56, native + dropbox only has been deployed: https://dev1.dataverse.org
Latest and greatest to-do checklist based on review of the new feature summary doc with @pameyer and @pdurbin and @dlmurphy.
isHasRsyncScript()
render logic to deliver more informative msg than just "Rsync script not available!" Should add "I'm broken" and "I'm busy" logic, with appropriate msg text for each (i.e. "contact Support", or "try again later"). Done in 23f3beb.Still to consider, which might ultimately get added to the to-do list:
Document with in-app messaging for this feature: https://docs.google.com/document/d/14mkaH_YW1jP-xlFMtLuW-xDuAx_3g3FbIuVJ_0YhLAE/edit
My new revisions to the messaging are highlighted in green.
Next step for me is to work on the documentation.
I just committed the bulk of the Dual Mode documentation, but on Monday morning I want to briefly check around the rest of the guides and make sure other sections about rsync are still accurate.
I'm ready for code review as of cf041d5 so I dragged this issue over in https://waffle.io/IQSS/dataverse
Hold up, I have one more doc commit to make, then I'm ready too.
OK, docs are done too now, ready for code review.
@matthew-a-dunlap I addressed your code review feedback in 65407cd . Please check it out and let me know if there's anything else.
Discussed briefly with @sekmiller and @kcondon and here is what I have as the outstanding issues:
Happy to discuss in more detail if needed to get this to done.
rsync upload, unpublished Message should use email address rather than the name of the root dataverse per DMurphy:
rsync upload, published
rsync upload, published
On c7c6b7e
DCM only mode:
ui-datatable-tablewrapper
might be too short; clicking the "edit files" menu results in the appearance of a scrollbar. Seen with single package file present in dataset, may apply to other cases with single file.api/info/metrics/downloads/toMonth
endpoint returns 0
; but won't have any useful information (and in dual-mode, this info will be incomplete).Dual mode:
/dataCaptureModule/rsync
API endpoint still produces valid transfer script. scanner still picks it up, results in mixed dataset (endpoint would be expected to return an error if there's a native file).
Initial thinking is that it makes sense to defer metrics-related issues to a separate issue (which intersects with multiple storage locations).
rsync upload, unpublished
Messages needs to be updated to use support email addresses:
[x] http section should have support email rather than Harvard Dataverse Support: HTTP upload is disabled for this dataset because you have already uploaded files via rsync. If you would like to switch to HTTP upload, please contact Harvard Dataverse Support.
[x] rsync section should also have support email rather than null: You cannot upload additional files to this dataset. A dataset can only hold one data package. If you need to replace the data package in this dataset, please contact null.
adb73a4000f42a8ffca8e8dbb24754a31e14cd00 ; in DCM only mode
[x] Suggestion for slight wording change "Upload files using rsync via SSH. This method is recommended for large file transfers. The rsync upload script will be available on the Upload Files page once you save this dataset." -> "Upload files using rsync via SSH. This method is recommended for large file transfers. The upload script will be available on the Upload Files page once you save this dataset."
[x] "Redownload DCM Script" button might need better wording, and visually should be on the other left so the inactive "upload files" button doesn't move. Also seems that counter-intuitive that the "redownload DCM script" button sends the user the same page as the "upload files" button would have.
[x] If we need a "redownload DCM script button", it shouldn't be visible once there's a package in the dataset.
[x] was able to reproduce the ui-datatable-tablewrapper
behavior:
[ ] selecting a file causes the file table to move down due to appearance of "1 file is currently selected..." Possible UX problem with things jumping up and down.
As @pameyer suggested above, the button added for "Redownload DCM Script" when the dataset is locked for upload, needs some attention. This issue was moved back into develop in order to solve this. The button currently links to the File Upload pg which is the same page the disabled "Upload Files" btn links to.
In an attempt to deliver something closer to what was initially suggested, I have managed to get this Frankenstein looking button and message block solution mocked up. (It is only mocked up because when you click on the button you don't get the script, but instead the page refreshes and the dataset lock warning msg is no longer displayer.)
Will be working to get something similar to this solution, or a text link -- if possible, but there were concerns about the onclick
nature of this link, as opposed to a URL you can navigate to.
After reviewing options with the team, it was determined the easiest solution would be to make the "Download DCM Script" button on the dataset pg actually download script, instead of link to the Upload Files pg. (What made it even easier was that the DatasetPage.java bean already had the DatasetPage.downloadRsyncScript()
code in it!)
Moving back to QA with the blessing of @pameyer. Will address his last "jumping up and down" UX issue in another issue or later in this QA cycle.
On 92ec40559e37dfd3c795f15b0732ff4165de460a
big-data mode:
dual-mode:
A little more investigation on the mixed dataset; native upload APIs are blocked when a DCM upload lock is present, but not blocked when there's already a package file in the dataset.
checked in fixes for both of @pameyer 's notes above. moving back to QA assigning back to Pete.
@sekmiller Thanks; fixes look good.
It looks like I've run out of ways to break this; the closest I came was inconsistent UI state:
one way native upload can be inactive
the other way native upload can be inactive
There were suggestions from the UI review yesterday (https://docs.google.com/document/d/1Cshqx10BnYKy8HWEPFgI7Dho8ObXSN8zo48mP3t9Egk); @djbrooke are we considering those (chevron icons; red error -> yellow warning; link in "please contact support") in-scope for this issue, or splitting them off?
Thanks @pameyer - we should do those suggestions as part of this issue.
@dlmurphy can you put together a checklist from the notes doc and add it here?
Checklist from UX Review:
[X] Add text to end of rsync upload script: "ALL TRANSFERS COMPLETE. Dataverse is now verifying the checksum and your upload should appear in the UI in ~5 minutes." @pameyer, feel free to rephrase this, the main thing is just to set expectations for when the file will appear in the UI.
[x] Upload method panel header need chevron icons to indicate open/closed state, similar to this metadata block, for example:
[x] As soon as one http file is uploaded, the rsync upload window should change its message to the warning message. You shouldn’t need to click save first.
[x] rsync upload warning message when http files are added -- message should be yellow warning style, not red error style.
[x] In message when DCM fails to generate script - “Please contact support” should link to the support modal.
[x] When you delete a file uploaded with HTTP, you end up with an empty upload pg:
Added chevron icons to upload method headers. Fixed warning msg styling. Added new render logic for missing msgs when HTTP file is added. de33aec8f88981ffcea0f7f0286326dea99165eb
Had to remove !empty EditDatafilesPage.fileMetadatas
render logic from the dataTable to get the "There are no selected files to display." empty table msg on the Edit Files pg to display again. Unfortunately that also return the "Files you upload will appear here." empty table msg on the Create Dataset and Upload Files pgs.
As a repository operator, I'd like to be able to support Dataverse native HTTP (browser and API) uploads and DCM uploads in the same installation. Similarly, I'd like to be able to support Dataverse native HTTP, S3 redirect (, swift?) download methods alongside RSAL downloads in the same installation.