Closed by kbraak 9 years ago
(No text was entered with this change)
Original issue reported on code.google.com by wixner
on 2010-12-07 10:43:19
Should no data be available for some section, the section must still be displayed, with
text "Not available". This allows editors to see what sections are incomplete, and
encourage the author to go back and add to their metadata.
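A minimal sketch of this requirement, in Python for brevity. The function and constant names are hypothetical and are not taken from Eml2Rtf.java; the sketch only shows the idea of keeping an empty section visible with a placeholder.

```python
# Hypothetical helper: names are illustrative, not the IPT's actual code.
PLACEHOLDER = "Not available"


def render_sections(sections):
    """Return text lines for (title, content) pairs, keeping empty sections visible."""
    lines = []
    for title, content in sections:
        lines.append(title)
        # Treat None and whitespace-only strings as missing data.
        text = (content or "").strip()
        lines.append(text if text else PLACEHOLDER)
    return lines


if __name__ == "__main__":
    demo = [("Taxonomic coverage", "Some text"), ("Temporal coverage", None)]
    print("\n".join(render_sections(demo)))
```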
Original issue reported on code.google.com by kyle.braak
on 2011-02-01 08:40:56
(No text was entered with this change)
Original issue reported on code.google.com by wixner
on 2011-02-01 11:34:04
The majority of the code was committed in r2990; only Eml2Rtf.java still needs to be
extended to cover all details.
Original issue reported on code.google.com by wixner
on 2011-02-02 15:27:06
(No text was entered with this change)
Original issue reported on code.google.com by oliver.meyn
on 2011-02-16 10:29:09
The code is almost finished.
The file IndFauna.rtf is a generated document using the attached eml.xml file.
It's very similar to the proposal, except for some details that are not clear to me.
ISSUE 1:
- How should the text be formatted to handle metadata multiplicity?
As an example, there is only one Taxon Coverage in the proposal file, but the IPT permits
adding several.
The same problem applies to Dataset Descriptions, Curatorial Units, Temporal Coverages,
and Method Descriptions.
I added some examples of the multiplicity problem to the generated file to show how
they would look in the document.
ISSUE 2:
- In the Project description section, the IPT permits adding only one Personnel entry,
without an email. But in the proposal document there are several Personnel entries with emails.
Original issue reported on code.google.com by htobon
on 2011-03-10 19:16:22
The content written in red should be inserted manually.
There are several ways to format the multiplicity.
One of them is to repeat everything (title and content), as I did with "Curatorial unit".
Another is to add only one title and divide the content using bullets (or numbering?), as I
did in the "Methods" section.
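The two options can be compared side by side with a small sketch (Python, for brevity). The function names and the plain-text output are assumptions for illustration only; the real RTF output is produced by Eml2Rtf.java.

```python
def repeat_title(title, items):
    """Option 1: repeat the title in front of every entry."""
    return [f"{title}: {item}" for item in items]


def single_title_with_bullets(title, items):
    """Option 2: print the title once, then one bullet per entry."""
    return [title] + [f"- {item}" for item in items]


if __name__ == "__main__":
    units = ["first entry", "second entry"]
    print("\n".join(repeat_title("Curatorial unit", units)))
    print("\n".join(single_title_with_bullets("Methods", units)))
```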
Original issue reported on code.google.com by htobon
on 2011-03-10 19:35:39
In the last example I forgot to add a "Formation Period" and a "Living Time Period"
in the eml.xml.
Please see these new attached files
Original issue reported on code.google.com by htobon
on 2011-03-14 14:49:19
This morning I committed changes that ensure the RTF gets generated during publish,
and the RTF download is available on the publish resource and manage resource pages.
The eml file you included above did not validate against the GBIF or EML schemas. It
couldn't have been generated by the IPT, could it? I modified the eml file so that it
validates successfully - please see it included in the attached DwC-Archive.
When trying to import my DwC-Archive (that I also modified to use 2 letter lang attributes)
into the latest built IPT (Version 2.0.2-SNAPSHOT-r3123), I get the error: "lang argument
needs to be a 2 letter language code". Please see the attached screenshot. This error
prevents resource publishing and hence prevents RTF creation.
TODO:
1. (Most urgent) Resolve the import problem with the lang code so that I can keep testing
the RTF
2. Add logging information to the Publish Status page for RTF generation just like
we do for the generation of the DwC-Archive
Original issue reported on code.google.com by kyle.braak
on 2011-03-16 13:57:02
Some details I noticed:
First of all, the IPT generates the lang attribute using 3 letters. I don't know
why you used 2 letters when you changed it manually.
The code <title xml:lang="en"> must be changed to <title xml:lang="eng">
On the other hand, looking at the GBIF EML schema, the lang attribute does not specify
the ISO format. So, in theory, the lang attribute could be used with 2 or 3 letters.
That's why you did not get an error.
One question here: currently the IPT throws an error when the lang attribute has 2
letters. Should we validate it and transform it to 3 letters during the import process?
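One way to picture the proposed transformation is the sketch below (Python, for brevity). The lookup table is deliberately tiny and only illustrative, the function name is hypothetical, and none of this is the IPT's or dwca-reader's actual code.

```python
# Illustrative subset of ISO 639-1 (2 letter) to ISO 639-2 (3 letter) codes.
# A real implementation would need the complete code list.
TWO_TO_THREE = {"en": "eng", "es": "spa", "pt": "por", "da": "dan"}


def to_three_letter(lang):
    """Normalise a lang attribute value to a lower-case 3 letter code."""
    code = lang.strip().lower()
    if len(code) == 3 and code.isalpha():
        # Already 3 letters; this sketch does not check that the code really exists.
        return code
    if len(code) == 2 and code in TWO_TO_THREE:
        return TWO_TO_THREE[code]
    raise ValueError(f"unsupported language code: {lang!r}")


if __name__ == "__main__":
    print(to_three_letter("en"))
```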
Second thing:
I got 3 errors when validating eml.xml against the GBIF schema.
1. There was an empty citation element (Daniel told me this error was fixed on Feb
21 in the dwca-reader project), so a new dwca-reader version needs to be built.
2. formatVersion element: Invalid character encountered (the content must be a decimal
number). In the schema, the formatVersion element is a decimal. But I think this element
needs to be xs:string, because a format version could be 1.1.2.3 or 1.2.3-SNAPSHOT,
which is not necessarily a decimal (only digits and a single point). The schema should
be changed:
<xs:element name="formatVersion" type="xs:decimal">
should be changed to
<xs:element name="formatVersion" type="xs:string">
3. The formationPeriod element is in the wrong order (Daniel recently fixed this; he is
going to commit it today).
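To illustrate point 2, the sketch below (Python, for brevity) approximates the lexical form of xs:decimal with a regular expression and shows that dotted or suffixed version strings fall outside it. The regular expression is a simplification written for this example, not something taken from the GBIF schema or a validator.

```python
import re

# Approximation of the xs:decimal lexical form: optional sign, digits,
# at most one decimal point. No exponents and no letters.
XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xs_decimal(value):
    """True if value looks like a valid xs:decimal literal."""
    return XS_DECIMAL.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    for version in ("1.0", "1.1.2.3", "1.2.3-SNAPSHOT"):
        print(version, is_xs_decimal(version))
```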
Original issue reported on code.google.com by htobon
on 2011-03-16 16:10:37
Sorry, I meant to report that the error occurred after pressing publish (the import
was fine)
I'm pretty sure Daniel ensured that both 2 and 3 letter codes are handled on import, from
what I remember of issue 629.
Whether using a 2 letter or a 3 letter lang, the same error still appears: "lang
argument needs to be a 2 letter language code"
In addition to the problems against the GBIF schema, there were also a couple against
the EML schema. I noticed these using the online validator http://tools.gbif.org/dwca-validator/eml.do
Please see that eml file included in the attached archive as a guide.
Original issue reported on code.google.com by kyle.braak
on 2011-03-16 16:33:33
fixed error: "lang argument needs to be a 2 letter language code"
r3125
Original issue reported on code.google.com by htobon
on 2011-03-16 23:13:24
Great, that error seems to be gone now.
There are still 2 problems with the eml. Please use a reliable XML editor and the online
validator http://tools.gbif.org/dwca-validator/eml.do to help determine whether the
outputted eml is validating against the GBIF and EML schemas
Bad order:
<specimenPreservationMethod></specimenPreservationMethod>
<formationPeriod></formationPeriod>
<livingTimePeriod></livingTimePeriod>
Good order:
<formationPeriod></formationPeriod>
<specimenPreservationMethod></specimenPreservationMethod>
<livingTimePeriod></livingTimePeriod>
Bad order:
<methodStep>
  <description>
    <para></para>
  </description>
</methodStep>
<qualityControl>
  <description>
    <para></para>
  </description>
</qualityControl>
<sampling>
  <studyExtent>
    <description>
      <para></para>
    </description>
  </studyExtent>
  <samplingDescription>
    <para></para>
  </samplingDescription>
</sampling>
Good order:
<methodStep>
  <description>
    <para></para>
  </description>
</methodStep>
<sampling>
  <studyExtent>
    <description>
      <para></para>
    </description>
  </studyExtent>
  <samplingDescription>
    <para></para>
  </samplingDescription>
</sampling>
<qualityControl>
  <description>
    <para></para>
  </description>
</qualityControl>
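A quick check for this kind of ordering problem could look like the sketch below (Python, for brevity). The wrapper element name `methods` and the helper name are assumptions for the example; a schema validator remains the authoritative check.

```python
import xml.etree.ElementTree as ET

# Required relative order of the child elements, as listed under "Good order".
EXPECTED_ORDER = ["methodStep", "sampling", "qualityControl"]


def in_expected_order(parent, expected):
    """True if the children named in `expected` appear in that relative order."""
    positions = [expected.index(child.tag) for child in parent if child.tag in expected]
    return positions == sorted(positions)


# The <methods> wrapper is assumed here only to make the fragments well-formed.
bad = ET.fromstring("<methods><methodStep/><qualityControl/><sampling/></methods>")
good = ET.fromstring("<methods><methodStep/><sampling/><qualityControl/></methods>")

if __name__ == "__main__":
    print(in_expected_order(bad, EXPECTED_ORDER), in_expected_order(good, EXPECTED_ORDER))
```

The same helper works for the other case by passing the list formationPeriod, specimenPreservationMethod, livingTimePeriod as `expected`.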
Original issue reported on code.google.com by kyle.braak
on 2011-03-17 13:00:06
OK, I have already fixed the first problem in my workspace. I'll fix the second and
then commit.
Original issue reported on code.google.com by daniel.amariles88
on 2011-03-17 14:18:45
elements order changed in http://code.google.com/p/darwincore/source/detail?r=1383
Original issue reported on code.google.com by daniel.amariles88
on 2011-03-17 16:42:16
I am using r3130, and the EML still does not fully validate. See comment 13: in the
attached file, these elements are still arranged as
<specimenPreservationMethod></specimenPreservationMethod>
<formationPeriod></formationPeriod>
<livingTimePeriod></livingTimePeriod>
which should be
<formationPeriod></formationPeriod>
<specimenPreservationMethod></specimenPreservationMethod>
<livingTimePeriod></livingTimePeriod>
Original issue reported on code.google.com by burkeker.gbif
on 2011-03-18 09:40:50
yes, you are right.
A new dwca-reader version is going to be built soon.
Original issue reported on code.google.com by htobon
on 2011-03-18 13:22:17
1. Add several taxa: This field does not save any data.
2. Project Data: The fields Personnel First Name and Personnel Last Name show only the
first name of the person; the last name is missing. Furthermore, this field does not allow
adding more than one person.
3. Curatorial Units: Data entry under Method type “Count with uncertainty” does not
work.
4. Formation period: Should we have provision to enter Formation Period?
Original issue reported on code.google.com by pusagate
on 2011-03-24 10:27:41
(No text was entered with this change)
Original issue reported on code.google.com by kyle.braak
on 2011-03-24 12:45:00
1. Add several taxa: You are right, this field does not save any data. The main objective
of this field is to let the user enter a list of scientific names (one per
line) and thus make metadata entry much easier. However, until the user clicks
"save", the information is not saved in the EML file, and for that reason it
will not be shown in the RTF document.
2. Project Data: Correct, the last name is now displayed in the RTF document (see
r3141). On the other hand, the current IPT version only allows the addition of one
personnel entry because of the GBIF EML schema specification. Allowing more would require
changing not only this schema but also the object relations in the dwca-reader project.
(Perhaps this will be implemented in a future IPT version.)
3. Curatorial Units: I made a quick test, and it seems that the curatorial unit fields
are working well. If the method type is "Count with uncertainty" with value [method
type], and the fields "Count" and "+/-" have values [X] and [Y] respectively, then
the RTF document will show: "Curatorial Unit: X with an uncertainty of Y (method type)"
4. Excuse me, could you please explain what you mean by "provision"?
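The output format described in point 3 can be sketched as follows (Python, for brevity). The function name and the sample values are hypothetical; only the sentence pattern comes from the comment above.

```python
def format_curatorial_unit(count, uncertainty, method_type):
    """Build the line shown for a curatorial unit counted with an uncertainty."""
    return f"Curatorial Unit: {count} with an uncertainty of {uncertainty} ({method_type})"


if __name__ == "__main__":
    # Sample values are made up for the demonstration.
    print(format_curatorial_unit(1200, 50, "specimens"))
```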
Original issue reported on code.google.com by htobon
on 2011-03-24 15:16:46
By the way, there is currently an issue that proposes adding several personnel entries
to the Project Data section (please see issue 558).
Original issue reported on code.google.com by htobon
on 2011-03-24 15:21:24
Curatorial Unit: When you select the method type "Count with uncertainty", it accepts the
data and also prints it in the RTF; however, it does not display it in the IPT after the
data entry has been saved.
After entering data in this field, this is what the IPT outputs. See screen 1.
After saving data in this field, the user cannot go back to see what they entered.
Hope it makes sense.
Original issue reported on code.google.com by pusagate
on 2011-03-25 10:54:44
The other issue I just observed is that each time you edit metadata and click on publish,
it generates a new EML file, and the system names the EML files EML-1, EML-2 and EML-3.
But the system does not generate a new RTF after the metadata has been edited and saved.
Original issue reported on code.google.com by pusagate
on 2011-03-25 10:57:00
Only the latest RTF is kept, there are no older versions archived. Is this required
for anything?
Original issue reported on code.google.com by wixner
on 2011-03-25 14:30:45
About Curatorial Unit: Apparently there was a problem with numbers that had more than
3 digits. Problem was fixed in dwca-project.
(Details: http://code.google.com/p/darwincore/source/detail?r=1387)
Original issue reported on code.google.com by htobon
on 2011-03-25 15:24:00
About versioning the RTF file: I don't see the reason for this functionality. However,
it is done in r3145.
Original issue reported on code.google.com by htobon
on 2011-03-28 13:00:40
Following testing by Zookeys (see attached file):
Joe Cora1, 1, 3, Elijah Talamas1, Norman Johnson1
1 Ohio State University, 1315 Kinnear Road, 43212, Columbus, United States; 2 Ohio
State University, Columbus, United States
Corresponding authors: Joe Cora (cora.1@osu.edu), (hol-help@osu.edu)
Why does Joe Cora have "1, 1, 3" when there are only two addresses, and why does
"hol-help@osu.edu" appear in the Corresponding authors line?
Original issue reported on code.google.com by kyle.braak
on 2011-04-07 16:07:14
Superscript numbers fixed. There was a problem when some authors in the metadata had no
last name (only an organisation).
The Corresponding authors section was also fixed.
r3152
Please see the attached file (generated by the IPT).
Original issue reported on code.google.com by htobon
on 2011-04-07 19:05:11
Superscript problems again:
authors: I get 1, 3, 4, 5 (no 2)
addresses: I get 1, 2, 3, 4
I don't mind so much if the count isn't entirely incremental (ie skipping author 2),
but certainly address 5 should be listed
I attach the archive and outputted RTF to help you debug.
Original issue reported on code.google.com by kyle.braak
on 2011-04-18 12:59:08
Yes, there was another bug with agents who have the same name (first/last) and the same
address, but a different email and homepage.
In the attached file, address number 5 is supposed to be address 4, and so on.
Fixing the repeated agents problem also fixed the address problem.
r3168 r3169
Could you please check it again?
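The numbering rule being debugged here (authors sharing an address share a superscript, and every superscript has a listed address) can be sketched like this in Python. The data shapes and names are assumptions for illustration and do not mirror Eml2Rtf.java.

```python
def number_affiliations(authors):
    """Assign superscript numbers to addresses in order of first appearance.

    authors: list of (name, address) pairs.
    Returns (labels, addresses): labels pairs each name with its address number,
    and addresses[i] is the address printed next to superscript i + 1.
    """
    numbers = {}  # address -> superscript number
    labels = []
    for name, address in authors:
        if address not in numbers:
            numbers[address] = len(numbers) + 1
        labels.append((name, numbers[address]))
    # Dicts keep insertion order, so this list is already in numbering order.
    return labels, list(numbers)


if __name__ == "__main__":
    print(number_affiliations([("A", "Addr X"), ("B", "Addr Y"), ("C", "Addr X")]))
```

Because numbers are handed out only when a new address is first seen, the superscripts never skip a value and never exceed the number of listed addresses.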
Original issue reported on code.google.com by htobon
on 2011-04-18 18:29:44
The author/address superscript problem appears to have been fixed.
I have several new issues that I have discovered (archive and rtf attached)
1) Outputted line from RTF generated reads: "Specimen preservation method: Deepfrozen".
The words Deep and frozen should be separated. Please ensure this is working for all
specimen preservations.
2) In the specification document DataPaperMappingwithMetadata.doc, it says "combinations
of ‘Organisation Name’, ‘Address’, ‘Postal Code’, ‘City’, ‘Country’ and ‘Email’ will
constitute the address". The emails are all missing from the addresses in the RTF right
now.
3) In the specification document DataPaperMappingwithMetadata.doc, section Natural
Collections Description is supposed to have element collectionIdentifier. This is missing
from the RTF right now.
Also a comment, we need to find out whether in the References section at the end, the
identifier for the bibliography should also be included - according to the specification
it's only the citation element that's displayed and not the citation identifier.
Original issue reported on code.google.com by kyle.braak
on 2011-04-19 14:40:45
r3172
1) Done. It now looks the value up in the vocabulary file.
2) done, however, the specification document says:
"......‘Country’ and ‘Email’ will constitute the address. If two or more author shares
same address, it will be denoted by same number. "
What would happen if two agents have the same address but different emails?
Which email should be inserted, the first one?
Kyle told me that in this case both addresses should be treated as different (it works
this way with the new changes). But are you sure? An address is composed only of: Address,
City, State, Country and Postal Code. And what if users have different phone
numbers or home pages?
3) done.
Other question:
In corresponding authors section, the document DataPaperMappingwithMetadata.doc says:
"...In case both creator and metadataProvider is the same, creator is the reflected
as corresponding author".
How should I compare them? Currently I'm comparing using address information (address,
city, province, country and postal code). But, as you mentioned before, should I also
compare the emails? And, again the same doubt: should I also compare the home pages and
the phone numbers?
Original issue reported on code.google.com by htobon
on 2011-04-19 16:19:11
Glad to see 1, 2, and 3 fixed.
Regarding your first question, if they have different phone numbers or home pages it
doesn't matter according to the specification. These fields don't get displayed on
the RTF anyways.
Regarding your second question, you should use the name and address that includes the
emails in your comparison. Otherwise, you could conceivably have two people with the
same name, in the same department, at the same address. The email would be the key
indicator in this case.
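The comparison rule agreed on here (name, address and email count; phone and home page do not) might be sketched as follows in Python. The `Agent` type, its field names, and the case-insensitive comparison are assumptions made for this example.

```python
from typing import NamedTuple


class Agent(NamedTuple):
    """Hypothetical stand-in for an EML agent (creator, metadataProvider, ...)."""
    first_name: str
    last_name: str
    address: str
    city: str
    province: str
    country: str
    postal_code: str
    email: str
    phone: str = ""
    homepage: str = ""


def identity_key(agent):
    """Fields that decide whether two agents are the same person.

    Phone and homepage are left out on purpose. Ignoring case and
    surrounding whitespace is an assumption of this sketch.
    """
    fields = (agent.first_name, agent.last_name, agent.address, agent.city,
              agent.province, agent.country, agent.postal_code, agent.email)
    return tuple(value.strip().lower() for value in fields)


def same_agent(a, b):
    return identity_key(a) == identity_key(b)
```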
Original issue reported on code.google.com by kyle.braak
on 2011-04-20 10:13:33
OK Kyle, got it!
New issues were received by email.
Copied and pasted here to inform everyone.
------------------
Written by Lyubo and Teodor:
1. A potential problem is that metadata authors do not fill in all important fields
and as a result we might get some very brief and schematic manuscripts like that one
in the attached file. Such manuscripts would merit immediate rejection which will discourage
authors. We can put VERY EXPLICIT requirements in our Author Guidelines, however the
authors may look at these after they have filled in the metadata.
In other words, do you see a possibility to design most important metadata fields as
mandatory?
2. I would rename the "download RTF" menu to Create a manuscript (RTF) - it is a way
more clear and "tempting"
3. The description field should perhaps be renamed to "Concise description" or Abstract.
.....
------------------
Original issue reported on code.google.com by htobon
on 2011-04-20 15:54:33
About the new issues in (Data paper-test2-metadata-3-RTF.rtf):
[D1]. I think this issue was fixed already.
[D2]. Fixed (changed to "Corresponding author(s)").
[D3]. Fixed (changed to "Concise description").
[D4]. I don't understand which link I should reference.
[D5]. "... Please avoid capitalizing in headings..". The following headers were changed:
"Combination of authors, ..."; "Common name"; "Taxonomic ranks"; "General taxonomic
coverage description"; "Taxonomic coverage"; "General spatial coverage"; "Spatial coverage";
"Temporal coverage"; "Design description"; "Study area descriptions/descriptor"; "Natural
collections description".
Is this OK? If anything else needs to be changed, just tell me.
[D6] Fixed (changed to "Dataset description").
[D7] Implemented now. All fields whose content is empty are omitted from the
generated document.
The changes were made in r3180 and r3181.
Original issue reported on code.google.com by htobon
on 2011-04-20 17:16:05
Dear Hector,
I am attaching the new RTF file with small, mostly cosmetic corrections.
Happy Easter
Original issue reported on code.google.com by Lyubo.penev
on 2011-04-21 13:41:01
OK, no problem.
I will check them next Monday.
H.
Original issue reported on code.google.com by htobon
on 2011-04-22 02:14:43
changes done in r3184
About the new issues from test-21-04-2011-metadata-4-LP-corrections.rtf file:
[D1] "Repeating names when the same person participates in different roles is certainly
a problem...".
- Fixed in previous revisions (r3180 and r3181).
[D2] "If only one corresponding author is listed, the comma should not be present
here".
- Fixed in previous revisions (r3180 and r3181).
[D3],[D5],[D8],[D11],[D15],[D16] "Can you insert an empty paragraph after this heading?".
- Done. A new line was added after the following headers: Concise description, Taxonomic
coverage, Spatial coverage, Project description, Natural collections description,
Methods and Datasets.
[D4],[D6],[D7],[D9],[D10],[D12],[D13],[D14] "Coverage should not be capitalized", etc..
- All these changes were made in previous revisions (r3180 and r3181).
[D17],[D18],[D21],[D22] "Where these question mark appear from? Please check and eliminate
them"
- I really don't know why these question marks are appearing. Could you please send
me the eml.xml file?
[D19] "As more than one datasets are usually provided (e.g., in DwC-A), we should insert
this heading (with an empty paragraph thereafter) before the start of the first dataset
description."
- done.
[D20] "Please delete “s”".
- Fixed in previous revisions (r3180 and r3181).
Original issue reported on code.google.com by htobon
on 2011-04-26 18:16:27
about test-21-04-2011-metadata-5-LP.rtf document:
changes done in r3188 and r3189
Not all the comments were addressed, since there are a couple that we don't know
exactly how to resolve.
[D1] "Any idea how we can avoid repeating addresses? I know it is not that easy as
it looks like…"
- The script is based on the first requirements.
[D2] "...concise description in IPT. Here, in the data paper it must be Abstract..."
- done
[D3] "Put colon “:” after Ketwords and move the keywords thereafter (no paragraph
after the heading".
- done
[D4] "Move the taxa thereafter (no paragraph after the heading. Do not capitalize
Ranks, should read “ranks”".
- title changed to Taxonomic ranks.
- New line removed.
[D5] " For large collections this section can become VERY large..."
- Not fixed yet. We also don't know how to format the rank list.
[D6] "Do not capitalize "
- title changed to Collection identifier
[D7] "1. There is a confusion here with the publication date of the paper itself. Please
change it to:
“Publication date of data:”
2. Why date is shown as “111”here instead of 28, e.g. 2011-04-28. Must be a bug"
- Title changed to Publication date of data
- Bug fixed.
Original issue reported on code.google.com by htobon
on 2011-04-28 15:45:02
new changes:
r3192:
- Adding a metadata resource link.
- Organising taxonomic ranks. (one per line).
r3194:
- Emails omitted for affiliations.
Original issue reported on code.google.com by htobon
on 2011-05-10 14:23:33
r3204.
Metadata resource link will appear only for public or registered resources.
Original issue reported on code.google.com by htobon
on 2011-05-12 19:23:40
Title changed:
"Metadata resource link" changed to "Data published through GBIF: <metadata resource
link>"
The word GBIF is linked to www.gbif.org
r3207
Original issue reported on code.google.com by htobon
on 2011-05-16 21:12:36
Resolution of this issue will require User Manual additions, right? Will it need a tutorial?
Original issue reported on code.google.com by gtuco.btuco
on 2011-05-17 16:48:41
Please roll back your changes from yesterday, and ensure changes are accompanied with
an update to the corresponding issue.
I don't think you have addressed the change needed to the public resource page, for
example http://ipt.pensoft.net/ipt/resource.do?r=waspsperu : "Archive" should be renamed
to "Darwin Core Archive"
Original issue reported on code.google.com by kyle.braak
on 2011-05-18 11:18:23
r3209
Changes were made for the following specifications:
a) If data is uploaded/published through the IPT, the "Dataset description" describes
the DwC-A being published by the IPT, and all other "External links" entered in the
IPT appear under a new section called "External datasets". The DwC-A dataset description
would look like the following:
Dataset description
Object name: Darwin Core Archive [insert Resource name here]
Character encoding: UTF-8
Format name: Darwin Core Archive format
Format version: 1.0
Distribution: http://ipt.gbif.org/resource.do?r=uniqueName
Publication date of data: 2011-04-21
Language: English
Licenses of use: IP Rights-field
b) If no data is uploaded/published through the IPT but there are one or more "External
links", the "Dataset description" reads: "There is no dataset published through Darwin
Core Archive format for this resource. Currently described datasets are listed in the
section "External datasets"." Accordingly, the "External links" are listed in the
"External datasets" section.
c) If there are no "External links" entered in the IPT, the section "External datasets"
in the RTF doesn't appear at all.
d) If there is no data uploaded/published through the IPT, and no "External links",
then that means only the metadata is being published and no "Dataset" and "Dataset
description" sections will appear at all.
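Rules a) to d) amount to a small decision table, which the following Python sketch tries to capture. The function name, the `"dwca"` marker, and the dictionary return shape are assumptions for illustration, not the IPT implementation.

```python
# Notice text for rule b), as worded in the specification above.
NO_DWCA_NOTICE = (
    "There is no dataset published through Darwin Core Archive format for this "
    'resource. Currently described datasets are listed in the section "External datasets".'
)


def plan_dataset_sections(has_dwca, external_links):
    """Decide which dataset sections appear in the RTF and what they hold."""
    plan = {}
    if has_dwca:
        # a) The description always refers to the DwC-A published by the IPT.
        plan["Dataset description"] = "dwca"
    elif external_links:
        # b) No DwC-A, but external links exist: show the notice instead.
        plan["Dataset description"] = NO_DWCA_NOTICE
    if external_links:
        # a) and b) External links only ever appear in their own section.
        plan["External datasets"] = list(external_links)
    # c) No external links: no "External datasets" section.
    # d) Neither data nor links: the plan stays empty (metadata only).
    return plan
```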
Original issue reported on code.google.com by htobon
on 2011-05-18 13:12:13
The word "Archive" was renamed to "Darwin Core Archive" in r3207 for the Overview page,
and in r3210 for the Public Resource page.
Original issue reported on code.google.com by htobon
on 2011-05-18 13:21:51
I missed this little change:
Object name: Darwin Core Archive [insert Resource name here]
It is done now. r3211
Original issue reported on code.google.com by htobon
on 2011-05-18 13:35:39
a) NOT fixed. You are taking the first external link and using it as the main dataset
description - why? The specification above says:
If data is uploaded/published through the IPT, the "Dataset description" describes
the DwC-A being published by the IPT, and all other "External links" entered in the
IPT appear under a new section called "External datasets".
That means the external links are only ever described in the External datasets section.
It also means the DwC-A being produced by the IPT is equal to the main dataset description,
so hard-code the values:
Object name: Darwin Core Archive [insert DwC-A Resource name here]
Character encoding: UTF-8
Format name: Darwin Core Archive format
Format version: 1.0
Distribution: [insert DwC-A download link here]
b) NOT fixed. When I finally add a source file and click publish the main dataset description
is not filled. If I press it again, it gets filled but of course using the 1st external
link which is wrong (relates to a) )
Furthermore please change the text to only read: There is no dataset published through
Darwin Core Archive format for this resource. Currently described datasets are listed
in the section External datasets.
c) NOT fixed. Again related to a) if I have a DwC-A being published I should have a
dataset description even if I don't have any External links.
d) FIXED, but you are so good at making things disappear that there is a new issue, please see below.
NEW - where has the "Dataset" heading disappeared to? It does not show up anymore above
"Dataset description". This obviously needs to be put back.
Original issue reported on code.google.com by kyle.braak
on 2011-05-19 10:31:13
Everything on the Datasets sections appears fixed now, thanks.
We are showing the language and licenses (IP Rights) even if no DwC-A is produced.
If this is not desired, please state that here.
Original issue reported on code.google.com by kyle.braak
on 2011-05-19 16:28:50
Yes, you are right Kyle. I misunderstood some concepts. I apologise.
I made new changes in r3217, r3218 and r3219.
Original issue reported on code.google.com by htobon
on 2011-05-19 16:34:03
Original issue reported on code.google.com by timrobertson100
on 2010-11-22 13:26:46
issue-471-attachments.zip