clarity-h2020 / ckan

CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers datahub.io, catalog.data.gov and europeandataportal.eu/data/en/dataset among many other sites.
https://ckan.myclimateservice.eu/

T2.2 Data Collection - M18 Review #17

Closed ghilbrae closed 4 years ago

ghilbrae commented 5 years ago

As you are well aware, we have received the results of the last review. The main points are:

Please read the two pages we've extracted from the document to get a better picture: T2.2 Review Comments.docx

On harmonization we had an issue that can be consulted for more info: https://github.com/clarity-h2020/ckan/issues/5

p-a-s-c-a-l commented 5 years ago

On harmonization we had an issue that can be consulted for more info: #5

The issue #5 is about meta-data harmonisation, not harmonisation / interoperability of data.

ghilbrae commented 5 years ago

I've noticed, but I don't know whether we should also consider how we did the meta-data harmonization and whether what we've put in CKAN follows our definition.

p-a-s-c-a-l commented 5 years ago

In theory, if we create the Data Packages for all DCs we'll get a description that follows the Data Package Meta-Data specification. This description can then be copied into CKAN.

ghilbrae commented 5 years ago

After one month we have no new information regarding this task.

We are kindly asking all DC leaders to review this and update both your data and the D2.2.

According to the document we attached to our first comment on this issue:

WP2: The report submitted is actually not convincingly in line with the vision for deliverable 2.2. The authors need to go through and acknowledge any changes from the DoA as at the moment it doesn’t read like a catalogue of data sources for each DC. The DoA would lead the reader to expect more on approaches for standardisation and normalisation which would lead to the more consistent treatment of data across DCs. For example, the Spanish DC is the most well formed, giving both a clear description of the data as well as where it has been obtained from.

Please review both the description given in the document in terms of the datasets selected and the elements and files published in CKAN. As discussed during the plenary meeting in Wien, you should take into account the EU-GL taxonomy when updating the information in CKAN. Also note that, as commented by the reviewer, any modification or deviation from the description of the DC given in the DoA should be justified or explained in some way.

WP2: As highlighted in the recommendations section of the report, it is unclear whether D2.2 delivers what was intended and as such appears to be a deviation from the DoA. As identified in the last review, there is still very little on harmonisation / interoperability of data.

(See above) It's also important to have a certain synchronization among the DCs, especially those that share the same framework (urban DCs).

From the report: ‘It should be noted that it is not the objective of this task to ensure that the data collected can cover the information needs defined in the user cases. Its objective is to ensure that adequate procedures have been established for the collection of information and to give support to the members of the consortium in the generation of a first set of data in order to ensure that the methodology is understood and that it meets the expected objective.’

This disclaimer for this deliverable is not aligned with the DoA which states: ‘D2.2: This deliverable will provide a catalogue of local data sources for each of the demo applications’

This has already been addressed in the working version of the document.

There are inconsistencies in the way that this is presented per DC. Table 5 (Spanish DC is best) with a clear description of the data AND where it is coming from. It reads less of a wishlist. This is highlighted as a deviation in the interim report (p30) but is somewhat buried. Overall, there is a need to highlight the deviations from DoA and more clearly state the actual vision for this deliverable. It is also recommended that more is included on approaches for standardisation, normalisation etc?

Same as above, justify any deviations from the DoA and improve the descriptions and data included both in the report and CKAN.

If you consider it necessary to discuss the issue of standardization and harmonization further, please let us know and we can discuss this during one of our Tuesday meetings.

ghilbrae commented 5 years ago

Dear all, please remember to spare a thought for this issue... Be aware that if you are updating or registering your Data Package you are already dealing with some of the stuff that is needed here.

DenoBeno commented 5 years ago

Has this been fixed?

DenoBeno commented 5 years ago

Can we close this issue?

p-a-s-c-a-l commented 4 years ago

Can we close this issue?

I don't know what the current status of this task is.

ghilbrae commented 4 years ago

This task is as it was in August; we have not received any feedback or updated information from any of the partners.

I don't know when it is planned to submit the updated documents, but we should not wait until the week before.

p-a-s-c-a-l commented 4 years ago

Any progress here? IMHO this is also relevant for the next and last Data Management Plan ...

ghilbrae commented 4 years ago

Dear all, this issue has been open for six months and there's no feedback yet.

Please take a look at it and update your documents, ckan, and the spreadsheet as required.

LenaStr commented 4 years ago

Dear Angela,

For DC2 we updated all documents, CKAN and CSIS in early fall, so I think we have done this. I will make a new round in August.

However, for the partners that are not so involved in the technical development of the CSIS, we need some other way of being notified than via GitHub, as we are not using it regularly. (I get notifications, but so many that it is hard to see what is relevant for me.)

We also need instructions for the different systems, and links, to make the tasks achievable, as the system has changed rapidly during the development.

Best Lena

From: Angela Rivera [mailto:notifications@github.com] Sent: 3 December 2019 10:03 To: clarity-h2020/ckan Cc: Strömbäck Lena; Assign Subject: Re: [clarity-h2020/ckan] T2.2 Data Collection - M18 Review (#17)


ghilbrae commented 4 years ago

Dear Angela, For DC2 we updated all documents, CKAN and CSIS in early fall, so I think we have done this. I will make a new round in August.

Thanks Lena, we'll take a look at it and come back if necessary.

However, for the partners that are not so involved in the technical development of the CSIS, we need some other way of being notified than via GitHub, as we are not using it regularly. (I get notifications, but so many that it is hard to see what is relevant for me.)

I understand this and I'm sorry. It was decided that this method of communication was better to keep track of everything. I don't know if the admin team can help with that and unsubscribe you from any project that does not concern you. Maybe you can ask them to help you with that.

We also need instructions to different systems and links to make the tasks achievable as the system has changed rapidly during the development.

Are you referring to the CSIS? CKAN and the other stuff have not changed but the CSIS has and it's still changing. If you need help with that you'll have to ask Denis.

Best

p-a-s-c-a-l commented 4 years ago

Related Plenary Meeting Actions and Decisions:

ghilbrae commented 4 years ago

@LenaStr @claudiahahn @RobAndGo @mattia-leone

To fulfill this task, you need to check that the data and CKAN entries (datasets + resources) that you have specified in the corresponding sections of document D2.2 are the data that you are actually using for your demo cases. If you could also take a look at the text, that would be great, but we'll be giving all the DCs a final read to ensure that they all sound similar.

We are going to add a new section to the document dedicated to the European Data Package, so it is not necessary for you to include this information. What must be included are any resources that are needed for your expert studies.

Please take a look at the information given regarding the meteorological data from TuTiempo.net (@mattia-leone) and Radar-based precipitation (historical??) (@LenaStr).

This week, we will be sending you all a list of the datasets in CKAN specifying whether the URLs work; please check any broken or wrong ones.

We intend to submit a new version of the document early next week, so send your contributions as soon as you can.
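
As a rough illustration of the URL check mentioned above, here is a minimal Python sketch. It assumes each CKAN resource is available as a dictionary with `name` and `url` keys; the function names `http_probe` and `broken_resources` and the sample entries are hypothetical, and this is not necessarily how the list sent to the DCs is produced.

```python
from typing import Callable, Dict, Iterable, List
from urllib.error import URLError
from urllib.request import Request, urlopen


def http_probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` succeeds with a status below 400.

    Some servers reject HEAD requests, so a False result only means
    "needs a manual look", not "definitely broken".
    """
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status < 400
    except (URLError, ValueError, OSError):
        # HTTP errors, malformed URLs, DNS failures and timeouts all end up here.
        return False


def broken_resources(
    resources: Iterable[Dict[str, str]],
    probe: Callable[[str], bool] = http_probe,
) -> List[Dict[str, str]]:
    """Return the resources whose URL is empty or fails the probe."""
    broken = []
    for resource in resources:
        url = (resource.get("url") or "").strip()
        if not url or not probe(url):
            broken.append(resource)
    return broken


if __name__ == "__main__":
    # Hypothetical entries, only to show the expected input shape.
    sample = [
        {"name": "resource with a link", "url": "https://example.org/data.nc"},
        {"name": "resource without a link", "url": ""},
    ]
    for resource in broken_resources(sample):
        print("check:", resource["name"])
```

The probe is passed in as a parameter so the check can be tried out with a stub function before running it against real links.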

LenaStr commented 4 years ago

Angela, can you please give me the link to the latest version of T2.2 in OwnCloud? I could only find one old document with review comments.

/Lena

From: Angela Rivera [mailto:notifications@github.com] Sent: 21 January 2020 12:08 To: clarity-h2020/ckan Cc: Strömbäck Lena; Mention Subject: Re: [clarity-h2020/ckan] T2.2 Data Collection - M18 Review (#17)


ghilbrae commented 4 years ago

@LenaStr I'm quoting the email you sent me in case it's useful to others.

In the document we do not list datasets, we are referring to an Excel sheet and to CKAN. Where is the Excel? Still relevant to update this?

The spreadsheet is available online: https://docs.google.com/spreadsheets/d/1YoWHFDbrdEFKgiis8yoMVihxcFm1HIoy8iRwBTNS0JI/edit#gid=1037233366

Though it would be nice to have it up to date, the really important thing to keep updated is the CKAN (and Zenodo when applicable). The spreadsheet was our first tool to easily compile the datasets we were going to use.

There have been a lot of changes to our focus in DC2 since the production of this. (In principle we have concentrated on some of the use cases described.) Should I just remove the old ones, or should I write an explanation of why we have not continued to work with them?

In terms of data, please do remove them. If you'd like to write a short explanation of this change of focus due to your interactions with users, for example, it can of course be included so as to illustrate how feedback is influencing the development of our use cases, approach, and consequently, datasets.

Best

LenaStr commented 4 years ago

I just uploaded a new version of D2.2 with updated information on DC2.

/Lena

From: Angela Rivera [mailto:notifications@github.com] Sent: 22 January 2020 09:43 To: clarity-h2020/ckan Cc: Strömbäck Lena; Mention Subject: Re: [clarity-h2020/ckan] T2.2 Data Collection - M18 Review (#17)


ghilbrae commented 4 years ago

Thanks!!

ghilbrae commented 4 years ago

We have uploaded a new version to owncloud including Lena's updates, but we still have no new inputs from either Plinius (@mattia-leone @stefanon) or ZAMG (@claudiahahn @RobAndGo). Can you please take a look at the document and update your data/info?

claudiahahn commented 4 years ago

@ghilbrae, we are working on updating CKAN. Should we include all climate indices that were calculated, even though they are not all available in CSIS yet? Do you want us to update the text in section 3.1.1 and/or include a section called data catalogue, like for the DCs, referring to CKAN?

ghilbrae commented 4 years ago

Yes, please. CKAN can be updated independently.

claudiahahn commented 4 years ago

The names in CKAN are different from the file names on the FTP. We probably have to change that, I am afraid?
E.g. for Summer days, the entry within CKAN is SD 2011-2040 RCP2.6, but the file names are HI_summer-days_rcp26_20110101-20401231_ensmean.nc and HI_summer-days_rcp26_20110101-20401231_ensstd.nc. That is, this CKAN entry needs to be renamed to HI_summer-days_rcp26_20110101-20401231_ensmean, and another entry added to account for HI_summer-days_rcp26_20110101-20401231_ensstd. Is this correct?

ghilbrae commented 4 years ago

The names in CKAN are different from the file names on the FTP. We probably have to change that, I am afraid?

I don't know, to be honest. The FTP is just something between us to exchange information and should not be a guideline for our naming conventions. CKAN should have a good description of the data and a working link to them.

So, for example, SD 2011-2040 RCP2.6 might be a good name for that resource (this is for you to determine), but the URL should point to the place where the data really is; it currently points to a wrong Zenodo link. This needs to link to the HI_summer-days_rcp26_20110101-20401231_ensmean.nc file. Each case might be different: you might need to point to Zenodo, the geoserver or your own URL, depending on the data.

E.g. for Summer days, the entry within CKAN is SD 2011-2040 RCP2.6, but the file names are HI_summer-days_rcp26_20110101-20401231_ensmean.nc and HI_summer-days_rcp26_20110101-20401231_ensstd.nc. That is, this CKAN entry needs to be renamed to HI_summer-days_rcp26_20110101-20401231_ensmean, and another entry added to account for HI_summer-days_rcp26_20110101-20401231_ensstd. Is this correct?

The problem I see in CKAN is that what we have there right now does not account for that ensmean and ensstd that we have. I don't know if the best option might be to change the names of the existing ones SD 2011-2040 RCP2.6 -> SD 2011-2040 RCP2.6 ensmean (along with the URL) and add an extra one for the ensstd.

Does this make sense to you?
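
To make the naming option above concrete, here is a small Python sketch that derives a CKAN-style resource name from one of the file names quoted in this thread. It is only an illustration: the file name pattern is inferred from the two examples given, the `ABBREVIATIONS` table contains only the one mapping that appears here (summer-days -> SD), and the function name `resource_title` is hypothetical.

```python
import re

# Pattern inferred from the file names quoted in this thread, e.g.
# HI_summer-days_rcp26_20110101-20401231_ensmean.nc
FILENAME_PATTERN = re.compile(
    r"^HI_(?P<index>[a-z0-9-]+)_rcp(?P<rcp>\d{2})_"
    r"(?P<start>\d{4})\d{4}-(?P<end>\d{4})\d{4}_(?P<stat>ens[a-z]+)\.nc$"
)

# Only the mapping mentioned in the thread; other indices would need adding.
ABBREVIATIONS = {"summer-days": "SD"}


def resource_title(filename: str) -> str:
    """Build a name such as 'SD 2011-2040 RCP2.6 ensmean' from a file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    index = match["index"]
    label = ABBREVIATIONS.get(index, index)  # fall back to the raw index name
    rcp_digits = match["rcp"]                # "26" is written as RCP2.6
    rcp = f"RCP{rcp_digits[0]}.{rcp_digits[1]}"
    return f"{label} {match['start']}-{match['end']} {rcp} {match['stat']}"


if __name__ == "__main__":
    for name in (
        "HI_summer-days_rcp26_20110101-20401231_ensmean.nc",
        "HI_summer-days_rcp26_20110101-20401231_ensstd.nc",
    ):
        print(name, "->", resource_title(name))
```

With this scheme, each file gets its own CKAN resource, and the ensmean and ensstd variants stay distinguishable by name.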

claudiahahn commented 4 years ago

ok, we have not yet uploaded the netcdf files to Zenodo or CCCA. Till this is done, we would just leave the link to the general Zenodo Clarity website, if that is ok. Or do you want us to insert the link to the geoserver? But that might change as well, right?

claudiahahn commented 4 years ago

I have updated D2.2. Not sure if that is how you want it to be. Please take a look. PLINIVS should look over it to make sure that what I have written regarding the local models etc. is correct.

ghilbrae commented 4 years ago

we have not yet uploaded the netcdf files to Zenodo or CCCA. Till this is done, we would just leave the link to the general Zenodo Clarity website, if that is ok. Or do you want us to insert the link to the geoserver? But that might change as well, right?

If you'll have to change the links later, I see no point in changing them now, I think this can wait.

I have updated D2.2. Not sure if that is how you want it to be. Please take a look. PLINIVS should look over it to make sure that what I have written regarding the local models etc. is correct.

Thanks a lot! We have yet to hear from Plinius (@mattia-leone & @stefanon), but we hope they can check both their contribution to DC1 and this.

RobAndGo commented 4 years ago

I have started to upload the data to zenodo - it is much easier to do than I thought it would be!! (Wolfgang's exasperated comments about uploading things to zenodo always made me think that this labour was one which Hercules himself would fail to accomplish!)

The zenodo link to the data is also being inserted into CKAN.

I have a question about the "Download Path(s)" link for the data package resources within CSIS. At the moment I have used links from the geoserver, e.g. https://clarity.meteogrid.com/geoserver/wcs?SERVICE=WCS&VERSION=2.0.1&REQUEST=GetCoverage&COVERAGEID=europe:HI_summer-days_rcp45_20110101-20401231_ensmean&FORMAT=geotiff&boundingbox=2145500,982500,6606000,5706500,urn:ogc:def:crs:EPSG::3035

Do I have to change this link to the zenodo link? Or is this geoserver link necessary for displaying the data to the user in the map-display of CSIS?

p-a-s-c-a-l commented 4 years ago

@RobAndGo You can safely change the download paths link. Map and table use the resource references (Example). All references have to be updated when the Geoserver instance changes.
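
As a rough sketch of what updating the references could look like if the Geoserver instance moves, the Python snippet below swaps only the scheme and host of a WCS link and leaves the path and query untouched. The function name `rebase_reference` and the target host `geoserver.example.org` are made up for illustration; the sample link is the one quoted earlier in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def rebase_reference(url: str, new_base: str) -> str:
    """Return `url` with its scheme and host replaced by those of `new_base`.

    The path and the query string (coverage ID, format, bounding box)
    are kept exactly as they were.
    """
    old = urlsplit(url)
    new = urlsplit(new_base)
    return urlunsplit((new.scheme, new.netloc, old.path, old.query, old.fragment))


if __name__ == "__main__":
    reference = (
        "https://clarity.meteogrid.com/geoserver/wcs?SERVICE=WCS&VERSION=2.0.1"
        "&REQUEST=GetCoverage"
        "&COVERAGEID=europe:HI_summer-days_rcp45_20110101-20401231_ensmean"
        "&FORMAT=geotiff"
        "&boundingbox=2145500,982500,6606000,5706500,urn:ogc:def:crs:EPSG::3035"
    )
    # Hypothetical new host, used only to show the effect.
    print(rebase_reference(reference, "https://geoserver.example.org"))
```

This only covers a change of host; if workspace or layer names change as well, the query parameters would need to be rewritten too.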