IQSS / dataverse

Open source research data repository software
http://dataverse.org

Files: Extra download options #1066

Closed eaquigley closed 9 years ago

eaquigley commented 9 years ago

Should downloading a subset happen on Dataverse or TwoRavens side?

eaquigley commented 9 years ago

Options for under download button:

landreev commented 9 years ago

@eaquigley : wait, what's going on with this menu? Are those really checkboxes, like in your comment above? I do remember, from that meeting with James and Vito and Mike, that we didn't want to have checkboxes in the menu (because that would require another "download" button, and we want the download to start as soon as they select an item from the menu). Once you finalize what the menu will look like, please give this ticket to Mike to make the menu, and then pass the ticket back to me so that I can wire the menu items to the backend functionality. The "subset" part, I'm assuming, will have to wait until we have a GUI for selecting individual variables (we may still be waiting for something from James to make this happen). There's a chance that we don't want this menu to be complicated at all... I recall everybody at that meeting was leaning toward just having "file + information", which will just provide a zip with the tab file, variable metadata (DDI) and citation, without even bothering to provide options for each of these things separately. If in doubt, maybe you could run it by Merce one more time?

eaquigley commented 9 years ago

@landreev apologies for the confusion, I don't mean for that checklist above to represent what should be happening in the UI; that was to better organize the list. @mcrosas and I talked more after the meeting about this dropdown menu and were going back and forth on whether a user should be able to download individual files or all of them at once as the default. @mcrosas, to confirm, do we want a user to be able to download individual files instead of getting a zip file of everything as the default?

mercecrosas commented 9 years ago

I think that the default should be the zip file with all files, but we should still provide the option to download individual files separately.


eaquigley commented 9 years ago

Okay, I'm passing this to @mheppler to make the menu as @landreev requested. I checked the files mockups and the dropdown menu is mocked up the way @mcrosas confirmed in the last comment. Mockup found here: https://iqssharvard.mybalsamiq.com/projects/files-dataverse40/Dataset%20-%20Files

eaquigley commented 9 years ago

One last thing to confirm, @mcrosas: in the mockup we have the submenu for file citations, but we had also discussed that a user would select file citation and get both formats. So should we implement: a) a submenu with file citation options to select from, or b) no submenu, and the user gets both formats when they click file citation under the download button?

mheppler commented 9 years ago

Here is a wireframe from my localhost that shows the Option A as it would be developed.

http://mheppler.pagekite.me/dataset.xhtml?id=3

landreev commented 9 years ago

OK, I like Mike's mockup a lot; no checkboxes/multiple choices in the menu is great. Let's just adopt it and move ahead. I may not have time to push this into beta 9 though.

mheppler commented 9 years ago

Checked in static placeholder wireframe for the Data File Citation submenu on the Download button for data files on the dataset page.

landreev commented 9 years ago

OK, I'll see what I can do with this for beta 9... May have to disable some of the menu items, if I can't get them all to work in the next couple of hours.

landreev commented 9 years ago

@mcrosas @eaquigley Can you point me to any documentation on what the RIS and EndNote citations should look like for a Datafile?

eaquigley commented 9 years ago

@posixeleni any insight for this? I did a quick search and need some clarification for @landreev. Based on the Wikipedia material for RIS, I'm guessing RIS for a data file would be:

TY - DATA
AU -
PY -
TI -
(some others here, maybe publisher or database or date accessed?)
ER -

landreev commented 9 years ago

@eaquigley @posixeleni The fields you listed above are the same ones we use for datasets; I'm assuming everything in the RIS citation for a dataset can safely be reused in the citation for a datafile that belongs to it (we do something similar with file/study citations in 3.6). But are there any sensible fields where I can stick the extra information, such as the file name? Should I just add it to the DO? Like: DO - doi/10.5072/FK2/2, myfile.txt

and of course I need similar info for the endnote format.

eaquigley commented 9 years ago

@landreev @posixeleni could we use C1 (custom field) for the UNF? @posixeleni could we use T2 for the data file name? i couldn't find anything for endnote so need to hear back from Eleni about that one.

landreev commented 9 years ago

@posixeleni @eaquigley For RIS, I'm going to do what Elizabeth suggested (C1 and T2); if we can think of something better before the release, we can change it. Thank you Liz.
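For illustration, a minimal Java sketch of the RIS record proposed at this point in the thread: the dataset-level fields are reused, T2 carries the file name and C1 carries the UNF. The class name, method names and the UNF placeholder value are hypothetical and not taken from the Dataverse code base; the only format assumption is the usual RIS line layout (tag, two spaces, hyphen, space, value) with TY first and ER last.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Dataverse code: builds an RIS record for a data file
// by reusing the dataset-level fields and adding T2 (file name) and C1 (UNF).
class RisFileCitationSketch {

    // RIS lines have the form "TAG  - value": two spaces, a hyphen, one space.
    static String tag(String tag, String value) {
        return tag + "  - " + value;
    }

    static String build(String author, String year, String title,
                        String doi, String fileName, String unf) {
        List<String> lines = new ArrayList<>();
        lines.add(tag("TY", "DATA"));   // reference type, first tag of the record
        lines.add(tag("AU", author));
        lines.add(tag("PY", year));
        lines.add(tag("TI", title));
        lines.add(tag("T2", fileName)); // secondary title carries the file name
        lines.add(tag("DO", doi));
        lines.add(tag("C1", unf));      // custom field carries the UNF
        lines.add(tag("ER", ""));       // end of record, last tag
        return String.join("\r\n", lines) + "\r\n";
    }

    public static void main(String[] args) {
        // All values below are placeholders for illustration only.
        System.out.print(build("Privileged, Pete", "2014", "Dataset Title",
                "10.5072/FK2/216", "text.tsv", "UNF:5:placeholder"));
    }
}
```

Keeping the tag formatting in one helper means the field assignment (T2, C1) can be changed before the release without touching the rest of the record.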

posixeleni commented 9 years ago

@landreev Modeling off of @eaquigley's suggestions for RIS: for the Data Citation export in EndNote XML you can put (see also #881):

file name under:

<secondary_title>testfile.jpg</secondary_title>

UNF under:

<custom1>UNF</custom1>

So it would look something like this (based off ICPSR https://www.icpsr.umich.edu/icpsrweb/ICPSR/rsxml/studies/3259):

<?xml version='1.0' encoding='UTF-8'?>
<xml>
<records>
<record>
<ref-type name="Dataset">216</ref-type>
<contributors>
<authors>
<author>Privileged, Pete</author>
</authors>
</contributors>
<titles>
<title>Dataset Title</title>
<secondary_title>text.tsv</secondary_title>
</titles>
<section>2014-09-10</section>
<dates>
<year>2014</year>
</dates>
<publisher>Root Dataverse</publisher>
<urls>
<related-urls>
<url>http://dx.doi.org/10.5072/FK2/216</url>
</related-urls>
</urls>
<custom1>UNF</custom1>
<electronic-resource-num>doi/10.5072/FK2/216</electronic-resource-num>
</record>
</records>
</xml>

Please let me know if I can help clarify. Also, to make sure this exports as valid EndNote XML, please refer to this site as well: http://www.ferroic.at.ethz.ch/publications/xml_format (although they don't claim 100% valid EndNote XML either, due to the lack of official documentation from EndNote).

eaquigley commented 9 years ago

@mheppler: @vjdorazio is working on the subsetting pop up. I wanted to make sure that we don't forget to add the Subset link into the download dropdown so this is simply a reminder about that.

landreev commented 9 years ago

I changed the subject of the ticket (removed the word "subset"). Subset download is actually handled somewhere else. This issue is for the extra download options, the ones that had "coming soon" next to them in beta 9. These are: file citation in RIS format; file citation in EndNote (XML) format (the citation is basically the same as the dataset-level citation, but with the filename and UNF added); and "All File Formats + Information", which produces a zip file with the tab file, the original file, the DDI and the 2 citation files in the formats above. RData is NOT included yet (for files ingested from non-R originals, that is). That is handled in ticket #1179. Once it's fixed, I'll add it to the menu and to the bundle.
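As a rough illustration of the "All File Formats + Information" bundle described above, here is a minimal Java sketch that packs the tab file, the original file, the DDI and the two citation files into one zip using the standard java.util.zip classes. The class name, the entry names and the placeholder contents are all hypothetical; how Dataverse actually names and streams these files is not stated in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not Dataverse code.
class DownloadBundleSketch {

    // Writes every (entry name -> content) pair into one in-memory zip archive.
    static byte[] zip(Map<String, byte[]> entries) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                zos.putNextEntry(new ZipEntry(e.getKey()));
                zos.write(e.getValue());
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Lists the entry names of a zip archive, in archive order.
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    // Placeholder bundle: entry names and contents are invented for illustration.
    static Map<String, byte[]> exampleBundle() {
        Map<String, byte[]> bundle = new LinkedHashMap<>();
        bundle.put("study.tab", "id\tvalue\n1\t42\n".getBytes(StandardCharsets.UTF_8));
        bundle.put("study.dta", new byte[] {0x01, 0x02});   // stands in for the original file
        bundle.put("study-ddi.xml", "<codeBook/>".getBytes(StandardCharsets.UTF_8));
        bundle.put("study-citation.ris", "TY  - DATA\r\nER  - \r\n".getBytes(StandardCharsets.UTF_8));
        bundle.put("study-citation-endnote.xml", "<xml/>".getBytes(StandardCharsets.UTF_8));
        return bundle;
    }

    public static void main(String[] args) {
        System.out.println(entryNames(zip(exampleBundle())));
    }
}
```

With this shape, adding RData to the bundle later would only mean adding one more entry to the map.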

kcondon commented 9 years ago

Tested the functionality described above. Download works as described; RIS and EndNote are formatted as described. The file extensions appear incorrect: both are .txt rather than .ris and .xml. I have not yet validated the EndNote format by loading it into the EndNote app, but will do that next. @posixeleni @landreev The file extensions for the RIS and EndNote citations should be .ris and .xml, according to #881, right? They are both .txt for files, and .txt and .xml for the dataset.

posixeleni commented 9 years ago

@kcondon Yes, these files should be exported with the extensions you mentioned:

For RIS the extension should be .ris
For EndNote XML it should be .xml

See this ICPSR record for reference (scroll down to Export Citation): https://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/3259

Otherwise the bibliographic citation software that these are meant to be imported into may not import the files properly.

landreev commented 9 years ago

OK, will change shortly...

landreev commented 9 years ago

OK, done. All RIS citation files should have .ris extensions now (for datasets, datafiles, and for datafiles as part of the "download everything" zip bundle). All endnote citations should have .xml extensions.

landreev commented 9 years ago

OK, per our discussion: 1) fixed the DOI (it should be using the Identifier, not the (db) Id); 2) switched to using the ref-type "Online Database" (type 45 in EndNote, TY "DBASE" in RIS). The filename is now Custom1 (C1 in RIS), the UNF is Custom2 (C2). Why is this important? See my comment from the Java class DatasetServiceBean:

    // "Ref-type" indicates which of the (numerous!) available EndNote
    // schemas this record will be interpreted as. 
    // This is relatively important. Certain fields with generic 
    // names like "custom1" and "custom2" become very specific things
    // in specific schemas; for example, custom1 shows as "legal notice"
    // in "Journal Article" (ref-type 84), or as "year published" in 
    // "Government Document". 
    // We don't want the UNF to show as a "legal notice"! 
    // We have found a ref-type that works ok for our purposes - 
    // "Online Database" (type 45). In this one, the fields Custom1 
    // and Custom2 are not translated and just show as is. 
    // And "Custom1" still beats "legal notice". 
    // -- L.A. 12.12.2014 beta 10
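To make the final field mapping concrete, here is a minimal Java sketch of an EndNote XML record using ref-type "Online Database" (45), with the filename in custom1 and the UNF in custom2. It uses the JDK's StAX XMLStreamWriter; the class and method names are hypothetical and this is not the DatasetServiceBean code. The element layout follows the example posted earlier in this thread, and the UNF value is a placeholder.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch, not the DatasetServiceBean code.
class EndNoteFileCitationSketch {

    // Writes <name>text</name>; writeCharacters escapes &, < and >.
    private static void element(XMLStreamWriter w, String name, String text)
            throws XMLStreamException {
        w.writeStartElement(name);
        w.writeCharacters(text);
        w.writeEndElement();
    }

    static String build(String author, String title, String year, String publisher,
                        String doi, String fileName, String unf) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartDocument();
            w.writeStartElement("xml");
            w.writeStartElement("records");
            w.writeStartElement("record");

            // Ref-type 45 ("Online Database"): custom1/custom2 are shown as is.
            w.writeStartElement("ref-type");
            w.writeAttribute("name", "Online Database");
            w.writeCharacters("45");
            w.writeEndElement();

            w.writeStartElement("contributors");
            w.writeStartElement("authors");
            element(w, "author", author);
            w.writeEndElement(); // authors
            w.writeEndElement(); // contributors

            w.writeStartElement("titles");
            element(w, "title", title);
            w.writeEndElement(); // titles

            w.writeStartElement("dates");
            element(w, "year", year);
            w.writeEndElement(); // dates

            element(w, "publisher", publisher);
            element(w, "custom1", fileName); // filename
            element(w, "custom2", unf);      // UNF
            element(w, "electronic-resource-num", doi);

            w.writeEndElement(); // record
            w.writeEndElement(); // records
            w.writeEndElement(); // xml
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write EndNote XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // All values below are placeholders for illustration only.
        System.out.println(build("Privileged, Pete", "Dataset Title", "2014",
                "Root Dataverse", "doi/10.5072/FK2/216", "text.tsv", "UNF:5:placeholder"));
    }
}
```

Using a stream writer rather than string concatenation means a filename containing characters such as & is escaped automatically.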

kcondon commented 9 years ago

This is working now, as described.

Closing