Closed informatics-dev closed 10 years ago
Comment by Simon Rycroft
I have added the required file extensions.
Comment by Alice Heaton
How would users upload the files? Would they upload them as images or as 'other' file types?
If they are uploaded as 'other' file types, how would we distinguish the XML files we do want to include in the export from the XML files we do not want to include?
Comment by Alice Heaton
Adding an option "include in DwC-A archive" to all files would clutter the interface for a feature that will not be used often.
An alternative would be to use a naming convention - so that XML files that should be included in DwC-A archives are named "myfile.dwca.xml".
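As an illustration only, the proposed naming convention could be checked with a small Python sketch like the one below. The function name is hypothetical, and this is not code from the Scratchpads export; it just assumes a file is marked for inclusion when ".dwca" is the second-to-last suffix of its name.

```python
from pathlib import PurePosixPath


def is_marked_for_dwca(filename):
    """Return True if the name follows the hypothetical '<name>.dwca.<ext>' convention."""
    # suffixes for "myfile.dwca.xml" is [".dwca", ".xml"]
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-2] == ".dwca"
```

With this check, "myfile.dwca.xml" would be included in the archive and "myfile.xml" would not.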
Comment by Sarah Phillips
Hi Alice, I'm not sure whether you want feedback from us or from the rest of the scratchpad team. We are happy for all XML files to be included in the archive; however, if you want to provide a way of excluding files, such as this naming convention, then that's fine. Note that most phylogenies are not in XML format, but we could adopt a similar convention, e.g. *.dwca.nex or *.dwca.nwk. For the portal it does not matter if other XML files are included in the export: by default the harvester skips all non-image files anyway, and if phylogeny and identification key harvesting is enabled, the harvester will try to detect the content type by downloading the XML file and examining the XML namespace of its root element.
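The namespace-based detection described above might look roughly like the following Python sketch. It is not the harvester's actual code: the function names are hypothetical, and the namespace URIs in the lookup table are assumptions that should be checked against the SDD, NeXML and PhyloXML schemas before use.

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Assumed namespace URIs; verify against the relevant schemas.
NAMESPACE_TO_KIND = {
    "http://rs.tdwg.org/UBIF/2006/": "sdd",
    "http://www.nexml.org/2009": "nexml",
    "http://www.phyloxml.org": "phyloxml",
}


def root_namespace(xml_bytes):
    """Return the namespace URI of the root element, or None."""
    try:
        # The first "start" event is always the root element, so we
        # can stop after it without parsing the rest of the file.
        for _event, elem in ET.iterparse(BytesIO(xml_bytes), events=("start",)):
            if elem.tag.startswith("{"):
                return elem.tag[1:].split("}", 1)[0]
            return None
    except ET.ParseError:
        pass  # not well-formed XML
    return None


def detect_kind(xml_bytes):
    """Map the root namespace to a content kind, or None if unknown."""
    return NAMESPACE_TO_KIND.get(root_namespace(xml_bytes))
```

Anything that is not well-formed XML, or whose root namespace is not in the table, comes back as None and can simply be skipped.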
Comment by Alice Heaton
Thanks for your feedback. After discussions here we're also happy to include all XML files. I will do this today.
Comment by Alice Heaton
Which Dublin Core Metadata Initiative type should be used for those files? Options are Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text.
I will assume Dataset is the correct one - let me know if not.
Comment by Alice Heaton
I have done this and created a branch, 2295-export-keyfiles-dwca, for testing.
Is there a particular site you would like to test this on?
Note that at this stage the files don't have a license (only images have a license field). If you need those files to have a license, this should be opened as a separate issue (once a license field is added to other file types, it would automatically be included in the DwC-A export).
Comment by Alice Heaton
I've asked Ed to review the changes (to ensure this does not break compatibility with other users of the DwC-A) and am assigning this to the support team for testing.
Comment by Laurence Livermore
As this was an eMonocot feature request I have asked Serene which site(s) already contain these files for the purposes of testing.
Comment by Serene Hargreaves
http://lomandroideae.e-monocot.org/ has an .xml (SDD) file http://lomandroideae.e-monocot.org/sites/lomandroideae.e-monocot.org/files/LomandroideaeGeneraForScratchpad.xml
http://families.e-monocot.org/ has a .nex (nexus) file http://families.e-monocot.org/sites/families.e-monocot.org/files/Monocot_Genera.nex
Comment by Laurence Livermore
Alice, I cannot find the branch "2295-export-keyfiles-dwca" for testing in Aegir. Can you let me know when it's available?
Comment by Simon Rycroft
I have created the platform.
Comment by Laurence Livermore
I attempted to clone http://lomandroideae.e-monocot.org/ to the platform but was unable to (option was not selectable).
I verified the platform but still could not clone the site to the platform "2295-export-keyfiles-dwca".
Comment by Simon Rycroft
Can you merge the master branch with the 2295 branch please Alice.
Comment by Alice Heaton
I've merged the branch (there were some conflicts that required manual resolution), updated the code on Quartz and verified the platform, so it should now be ready to test (as long as your test site name starts with "dev." or "dev-"; otherwise wait a bit for the code on Silica to sync).
I did not re-test the fix, however; I don't think there is a conflict with the other code that was added to the export.
Comment by Laurence Livermore
eMonocot sites cloned to the platform for the Redmine issue 2295 branch:
http://dev-lomandroideae.taxon.name/ http://dev-families.taxon.name/
Comment by Simon Rycroft
The DwC-A file has been rebuilt. Please check it as soon as possible.
Comment by Laurence Livermore
Works as expected:
Links to the .xml and .nex files are present in the "images.txt" file in the DwCA.zip from both sites.
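For anyone repeating this check, a rough Python sketch is given below. It assumes only that the archive contains a file named images.txt and that the links appear in it as plain http(s) URLs; the function name is hypothetical, and the column layout of images.txt is deliberately not assumed.

```python
import io
import re
import zipfile
from urllib.parse import urlparse

# Match URLs up to whitespace, a quote or a comma, so the sketch
# works whether images.txt is tab- or comma-delimited.
URL_PATTERN = re.compile(r"https?://[^\s\",]+")


def linked_files(archive_bytes, extensions=(".xml", ".nex")):
    """List URLs in images.txt whose path ends with one of the extensions."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        text = archive.read("images.txt").decode("utf-8", errors="replace")
    wanted = tuple(ext.lower() for ext in extensions)
    return [
        url
        for url in URL_PATTERN.findall(text)
        if urlparse(url).path.lower().endswith(wanted)
    ]
```

Passing the bytes of the downloaded DwCA.zip returns the matching links; an empty list would mean the key and phylogeny files were not exported.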
Comment by Simon Rycroft
The branch has been merged into master and will be included in the next release.
Description:
To store character matrix data (as a precursor to a fully developed character project module) and phylogeny data in the scratchpads, and to enable the eMonocot Portal to harvest them, Ben C suggests:
Users can upload SDD files and phylogeny files (Nexus / Newick / New-Hampshire Extended / NeXML / PhyloXML - pick the ones you think are relevant) to the scratchpad. We might need to specify the file extensions which these files should have:
SDD = .xml
Nexus = .nex
Phylip = .phy
New-Hampshire Extended = .nhx
NeXML = .xml
PhyloXML = .xml
These files (if published) should be exported in the images.txt file in the Darwin Core Archive (see https://docs.google.com/spreadsheet/ccc?key=0AsnL4wtLP8y0dGhzWkRmTmdabDdOamlwelpYS3VaTEE&usp=sharing). For the import to work we require the following fields:
dc:identifier = the URI of the file in the scratchpad
dc:format = the MIME type of the file, e.g.
SDD = application/xml
Nexus = text/plain
Phylip = text/plain
New-Hampshire Extended = text/plain
NeXML = application/xml
PhyloXML = application/xml
It would also be nice to have the following fields, but these are not essential:
dwc:taxonID = the checklist ID of e.g. the root taxon for the key or phylogeny
dc:references = the node in the scratchpad, same as identifier
dc:creator = the creator of the file
dc:description = longer description
dc:title = short title
dc:subject = keywords
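To make the two required fields concrete, here is a minimal Python sketch that derives dc:format from the file extension using the mapping listed above and pairs it with dc:identifier. The function and variable names are hypothetical; this illustrates the request and is not the Scratchpads export code.

```python
import posixpath
from urllib.parse import urlparse

# Extension to MIME type, as listed in this description.
MIME_BY_EXTENSION = {
    ".xml": "application/xml",  # SDD, NeXML, PhyloXML
    ".nex": "text/plain",       # Nexus
    ".phy": "text/plain",       # Phylip
    ".nhx": "text/plain",       # New-Hampshire Extended
}


def media_record(file_uri):
    """Build the two required fields for a file, or None if the extension is not listed."""
    extension = posixpath.splitext(urlparse(file_uri).path)[1].lower()
    mime_type = MIME_BY_EXTENSION.get(extension)
    if mime_type is None:
        return None
    return {"dc:identifier": file_uri, "dc:format": mime_type}
```

The optional fields (dwc:taxonID, dc:title and so on) could be added to the same dictionary where the scratchpad has values for them.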