ODM2 / YODA-File

The YAML Observation Data Archive & exchange (YODA) File Format
BSD 3-Clause "New" or "Revised" License

Report on validating YODA files in 'examples' directory #51

Open cdesyoun opened 8 years ago

cdesyoun commented 8 years ago

Using the validation tool, I tried validating the time series YODA files below from the examples directory:

  1. https://github.com/ODM2/YODA-File/blob/master/examples/time_series/YODA_TimeSeries_SpCond_LR_Mendon_AA.xlsm
  2. https://github.com/ODM2/YODA-File/blob/master/examples/time_series/YODA_TimeSeries_WtrTemp_LR_Mendon_AA.yaml
  3. https://github.com/ODM2/YODA-File/blob/master/examples/time_series/YODA_TimeSeries_pH_LR_Mendon_AA.yaml

First of all, all of these YODA files, which were generated from Excel files, have a YAML format error, for example:
  - &PersonID0010 {PersonFirstName:  "Amber", PersonMiddleName:  NULL, PersonLastName:  "Jones"}
 0 
Affiliations:
  - &AffiliationID0001 {PersonObj: *PersonID0001, OrganizationObj: *OrganizationID0001, IsPrimaryOrganizationContact: NULL, AffiliationStartDate: "2015-01-01 00:00:00", AffiliationEndDate: NULL, PrimaryPhone: NULL, PrimaryEmail: "chris.cox@usu.edu", PrimaryAddress: NULL, PersonLink: NULL}
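
A stray line like the " 0 " above can be located before the file is handed to a YAML parser. The following is a minimal sketch, assuming the stray lines always consist of digits only; `find_stray_scalar_lines` and the abbreviated sample are hypothetical and are not part of YODA-Tools.

```python
import re

# Assumption: a stray line holds nothing but digits and optional whitespace.
STRAY_SCALAR = re.compile(r"^\s*\d+\s*$")


def find_stray_scalar_lines(text):
    """Return (line_number, content) pairs for lines that contain only digits."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if STRAY_SCALAR.match(line)
    ]


# Abbreviated sample modeled on the excerpt above (not a complete YODA file).
sample = "\n".join([
    "People:",
    '  - &PersonID0010 {PersonFirstName:  "Amber", PersonLastName:  "Jones"}',
    " 0 ",
    "Affiliations:",
    "  - &AffiliationID0001 {PersonObj: *PersonID0010}",
])

stray = find_stray_scalar_lines(sample)
for number, line in stray:
    print(f"line {number}: {line!r}")
```

A digits-only line could be legitimate in other YAML documents (for example inside a block scalar), so the matches are best reviewed by hand rather than deleted automatically.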

After fixing this error in all of these files, I ran the validation tool and got the data type errors shown below. Validation of the Controlled Vocabulary (CV) names in them was successful.

(venv_odm)client64-147:yoda_tools cyoun$ python yoda.py validate --type timeseries --level 3 -c ../../YODA-File/examples/time_series/YODA_TimeSeries_SpCond_LR_Mendon_AA.yaml 
Type: timeseries Level: 3 CV type: True
Validation Result: False
please look into the generated log file.
CV validation Result: True
(venv_odm)client64-147:yoda_tools cyoun$ python yoda.py validate --type timeseries --level 3 -c ../../YODA-File/examples/time_series/YODA_TimeSeries_WtrTemp_LR_Mendon_AA.yaml 
Type: timeseries Level: 3 CV type: True
Validation Result: False
please look into the generated log file.
CV validation Result: True
(venv_odm)client64-147:yoda_tools cyoun$ python yoda.py validate --type timeseries --level 3 -c ../../YODA-File/examples/time_series/YODA_TimeSeries_pH_LR_Mendon_AA.yaml 
Type: timeseries Level: 3 CV type: True
Validation Result: False
please look into the generated log file.
CV validation Result: True

In the log file, "validate_timeseries.log":

2016-02-22 10:52:23,579 validate_timeseries INFO     Validating YODA file: ../../YODA-File/examples/time_series/YODA_TimeSeries_SpCond_LR_Mendon_AA.yaml
2016-02-22 10:52:37,072 validate_timeseries ERROR    Affiliations.AffiliationStartDate: unconverted data remains:  00:00:00
2016-02-22 10:52:37,085 validate_timeseries ERROR    ActionBy.AffiliationObj: unconverted data remains:  00:00:00
2016-02-22 10:52:37,872 validate_timeseries ERROR    TimeSeriesResultValues: Invalid value ['2015-10-21 19:00:00', -7] (list): 3 items expected, 2 found (at TimeSeriesResultValues['Data'][27917])
2016-02-22 10:52:37,872 validate_timeseries INFO     Validating CV
2016-02-22 11:08:56,129 validate_timeseries INFO     Validating YODA file: ../../YODA-File/examples/time_series/YODA_TimeSeries_WtrTemp_LR_Mendon_AA.yaml
2016-02-22 11:09:09,247 validate_timeseries ERROR    Affiliations.AffiliationStartDate: unconverted data remains:  00:00:00
2016-02-22 11:09:09,260 validate_timeseries ERROR    ActionBy.AffiliationObj: unconverted data remains:  00:00:00
2016-02-22 11:09:10,021 validate_timeseries ERROR    TimeSeriesResultValues: Invalid value ['2015-10-21 19:00:00', -7] (list): 3 items expected, 2 found (at TimeSeriesResultValues['Data'][27917])
2016-02-22 11:09:10,021 validate_timeseries INFO     Validating CV
2016-02-22 11:09:53,257 validate_timeseries INFO     Validating YODA file: ../../YODA-File/examples/time_series/YODA_TimeSeries_pH_LR_Mendon_AA.yaml
2016-02-22 11:10:06,436 validate_timeseries ERROR    Affiliations.AffiliationStartDate: unconverted data remains:  00:00:00
2016-02-22 11:10:06,447 validate_timeseries ERROR    ActionBy.AffiliationObj: Invalid value None (NoneType): must be date_format (at ActionBy[0]['AffiliationObj']['AffiliationStartDate'])
2016-02-22 11:10:07,190 validate_timeseries ERROR    TimeSeriesResultValues: Invalid value ['2015-10-21 19:00:00', -7] (list): 3 items expected, 2 found (at TimeSeriesResultValues['Data'][27917])
2016-02-22 11:10:07,190 validate_timeseries INFO     Validating CV

horsburgh commented 8 years ago

@cdesyoun - it looks like the affiliation start dates on these were not all filled out. Filling them in should get rid of most of the validation errors.

The formatting error with the "0" after some of the blocks is something we will have to fix. I'll talk to @AmberSJones and see what happened.

cdesyoun commented 8 years ago

@horsburgh I think you previously changed the affiliation start date from a datetime type to a date type. But those YODA files still use datetime values, for example, '2001-01-10 00:00:00'. Also, some data value records are missing the data value.
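
Both problems can be reproduced outside the validator. The sketch below assumes the schema expects a date-only format of `%Y-%m-%d` (the validator's actual format string is not shown in this thread) and that each data row should have 3 items, as the log reports. `check_date` and `short_rows` are hypothetical helpers, not YODA-Tools functions.

```python
from datetime import datetime

# Assumption: the schema expects date-only values in this format.
DATE_FORMAT = "%Y-%m-%d"


def check_date(value, fmt=DATE_FORMAT):
    """Return None if value parses under fmt, otherwise an error message."""
    if value is None:
        return "missing value"
    try:
        datetime.strptime(value, fmt)
    except ValueError as exc:
        return str(exc)
    return None


def short_rows(rows, expected=3):
    """Return (index, row) pairs for rows whose length differs from expected."""
    return [(i, row) for i, row in enumerate(rows) if len(row) != expected]


# A datetime string fails against a date-only format; a plain date passes.
print(check_date("2015-01-01 00:00:00"))
print(check_date("2015-01-01"))

rows = [
    ["2015-10-21 18:45:00", -7, 0.0],  # illustrative complete row, values made up
    ["2015-10-21 19:00:00", -7],       # the incomplete row reported in the log
]
print(short_rows(rows))
```

The first call returns the same kind of "unconverted data remains" message seen in the log, because `strptime` stops after the date and finds the time portion left over.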

horsburgh commented 8 years ago

@cdesyoun - It looks like those files may have been made using the older version of the template that doesn't have my fixes. I'm checking with @AmberSJones, but we can probably fix them pretty quickly.

valentinedwv commented 8 years ago

Update the submodules?

AmberSJones commented 8 years ago

We're going to regenerate those files using the newer version of the template. I'll let you know when they're ready and posted.

cdesyoun commented 8 years ago

@valentinedwv I updated the "YODA-File" submodule in "YODA-Tools" using PyCharm.

AmberSJones commented 8 years ago

I have uploaded several templates and associated YODA files that were created using the updated template (version 0.3.2). I am not seeing any issues with the AffiliationStartDate.

There are, however, still some lines with zeroes at the bottom of some of the blocks, and it seems to be different for single time series vs. multiple time series.

@PhilSuiter, can you experiment with these files to see if there are any formatting modifications that eliminate the '0' lines? I can't figure out why it would be different for the multiple time series vs the single time series.

PhilSuiter commented 8 years ago

I re-read the instructions and have been experimenting with this instruction, "'--- To add information beyond the allotted space, simply begin typing in the next row down and the table should automatically extend. If there is no room below the table, insert a row in the middle." I have been messing around with the templates and here is what I have found so far.

     For single time series: I couldn't ever resolve the issue with the 0's at the end of the People, Affiliations, and AuthorList blocks.  When I compared a completed single time series with the blank template, I noticed that the template has a default of 10 available rows for the people and authors in the "people and organizations" and "data citation" tabs. Since there are 11 people listed in the completed file, there must be an issue with the YODA generation not recognizing that last added row.  I tried inserting a row within the middle before copying and pasting, and that 11th row was still not recognized. I tried typing the rows in manually one at a time and that didn't work.     

     For multiple time series: Upon re-entering the multiple time series data into the blank template, I came across an error when trying to copy and paste the 12 organizations into the organizations block, which has a default of 10 rows. This error only occurred when I used the "Values (V)" paste option. There was no error when I used the typical "Paste (P)" paste option. See the snippets below to view the error.

[Screenshots: runtimeerror424, codeissue, codeissue2]

AmberSJones commented 8 years ago

Can you try those screenshots again? I can't view them.

valentinedwv commented 8 years ago

Phil, I think those images need to be uploaded when you make a comment on the GitHub website. They don't come through on email.

PhilSuiter commented 8 years ago

Alright, sorry about my technical image problems. I re-created the error and I'll attach the images below, as well as to my previous comment. This error occurred on my very first step of copying the list of organizations from a complete v0.3.2 multiple time series file into the v0.3.2 blank template. I realized that whether the error occurs depends on the paste option I use. There was an error when I used the "Values (V)" paste option, but no error when I pasted using "Paste (P)".

[Screenshots: runtimeerror424, codeissue, codeissue2]

ChristinaB commented 5 years ago

@aufdenkampe Do you have a citation format for examples? For example, if we use YODA_v0.3.3_TS_climate(wHeaders).xlsm as a starter template, I can add the Github URL in the HydroShare resource - unless you have another suggestion.

If you look at the Contributors on this HydroShare resource, I attempted to fill in some metadata, but I don't know how useful it is.

horsburgh commented 5 years ago

@ChristinaB - I clicked on your resource and noticed that you added the YODA-File GitHub repository as a contributor to your resource. I don't think that's the best way to reference this work. Instead, I would reference it in a readme file included in the resource, add it as key-value metadata, or use the Related Resources section in the HydroShare resource. Or use several of these options so it is clear to potential consumers of your HydroShare resource why you are making a connection to the YODA-File repository.

The GitHub repository did not participate as a contributor to your HydroShare resource (think People or Organizations for Contributors).

aufdenkampe commented 5 years ago

Hi @ChristinaB, thanks for using this and chiming in with a question on GitHub!

I like @horsburgh's suggestion to not include the repo as a contributor, but rather to just cite us.

I just created the following citable reference for you!

Sara Damiano, Anthony Aufdenkampe, Jeff Horsburgh, David Valentine, Amber Jones, Jacob Meline, … David Tarboton. (2019, May 13). ODM2/YODA-File: v0.1-alpha: Initial alpha release for testing (Version v0.1-alpha). Zenodo. http://doi.org/10.5281/zenodo.2796960

Let me know if this works to meet your need.

ChristinaB commented 5 years ago

Yes! I added the reference as a Source in the HydroShare resource and removed it from the contributor list. Thanks @aufdenkampe