gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Need for basic documentation on usage of GBIF data #66

Open gbif-portal opened 7 years ago

gbif-portal commented 7 years ago


Need for basic documentation on use of GBIF portal data download (and API)

First, the new portal feels and looks great, and is a big step forward, so I don't mean to sound too negative in the below, but this is an issue remark, right?

One thing that I feel is largely forgotten in the new (and the old) portal is easy access to basic usage information. If the goal is to attract new user groups, it should be possible to find, without embarking on a Google marathon, a clear and precise beginner's guide to GBIF (and DwC). The IPT Wiki has some nice pages on what DwC is (https://github.com/gbif/ipt/wiki/DarwinCore.wiki), and also on how to publish occurrence, event and checklist data. At least I believe so: I have seen those pages but can't find them again on the fly, which kind of illustrates the problem. This kind of information may be boring to the tribal community using the portal on a day-to-day basis, but it's absolutely needed for new users to get to that point without losing interest. Possible user groups span widely in terms of technical know-how and background, although one must assume some basic knowledge of how to use a computer.

This does not need to be a very resource demanding thing to do. Again, there are wiki pages for the IPT, but not for using GBIF data. To be fair, the new portal does come a long way in this, and particularly the links on the tools/users tabs are easy to find.

For example, but by no means limited to:


fbitem-67cbf865735ab20856e1e6649eab1c4073bc341e Reported by: anders.finstad@ntnu.no System: Firefox 51.0.0 / Ubuntu 0.0.0 Referer: https://demo.gbif.org/ Window size: width 1855 - height 923

kbraak commented 7 years ago

Just to add, many others were in agreement with this proposal, brought forward yesterday during the NSG meeting.

kbraak commented 7 years ago

The following two handbooks can help GBIF curate the data sets it indexes, and can help GBIF understand how to guide users in a) curating the data they download from GBIF.org and b) using it properly in accordance with the data terms of use/license. Notably, the second volume details the steps a researcher might take to curate a data set, from receiving the data to eventual reuse. It also contains a number of case studies. Please see below:

Curating Research Data, Volume One: Practical Strategies for Your Digital Repository.

Volume One of Curating Research Data explores the variety of reasons, motivations, and drivers for why data curation services are needed in the context of academic and disciplinary data repository efforts. The following twelve chapters, divided into three parts, take an in-depth look at the complex practice of data curation as it emerges around us. Part I sets the stage for data curation by describing current policies, data sharing cultures, and collaborative efforts underway that impact potential services. Part II brings several key issues, such as cost recovery and marketing strategy, into focus for practitioners when considering how to put data curation services into action. Finally, Part III describes the full life cycle of data by examining the ethical and practical reuse issues that data curation practitioners must consider as we strive to prepare data for the future.

Curating Research Data, Volume Two: A Handbook of Current Practice.

Building from the introductory base established in Volume One... the steps in Volume Two will detail the sequential actions that you might take to curate a data set from receiving the data (Step 1) to eventual reuse (Step 8). Notably, this handbook focuses on the data curation practices and techniques taken by curation staff in a digital repository setting, yet these steps will be valuable for anyone facilitating data management support for research data, regardless of the final destination of the data (e.g., long-term archiving). Therefore data curators, archivists, research data management specialists, subject librarians, institutional repository managers, and digital library staff will benefit from these current and practical approaches to data curation.

@dschigel @ahahn-gbif

kcopas commented 6 years ago

We're going to keep hearing that last one until we add more formats and/or label what we provide as .tsv (see latest https://twitter.com/ColinJCarlson/status/1023598320034480129)

andersfi commented 6 years ago

Thanks, the "what is Darwin Core" page is a great start!

A couple of points that I think could be added. For example, what about providing a link in the download mail/download information which resolves to a very short and basic information page whose main purpose is to point the user to further information? There are a lot of resources out there; they are just a bit hard to find for those not fully assimilated into the tribe. Technical documentation on the format (what is the encoding, structure etc.) is one such case. This is actually quite important, as opening a download file for the first time in e.g. R may require some guesswork and a lot of frustration before you get the details right. Also, there are several GBIF-specific terms in the interpreted download file that are not DwC. Easily accessible "documentation" on these (e.g. speciesKey etc.) would be great.
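To illustrate the kind of guesswork meant here, below is a minimal Python sketch of reading a download file. It assumes the file is tab-delimited, UTF-8 encoded, has a header row, and does not use quote characters around fields; these are assumptions to check against the actual download, not documented guarantees. The helper names `read_gbif_tsv` and `read_gbif_file`, the sample records, and the `speciesKey` values are made up for illustration.

```python
import csv
import io


def read_gbif_tsv(handle):
    """Parse a tab-delimited occurrence file into a list of dicts.

    Assumption: fields are separated by tabs and are NOT quoted, so
    quote characters inside values are kept as literal text.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def read_gbif_file(path):
    """Open a file on disk, assuming UTF-8 encoding (hypothetical helper)."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_gbif_tsv(handle)


# Made-up sample standing in for a real download; the keys are placeholders.
SAMPLE = (
    "gbifID\tscientificName\tspeciesKey\tlocality\n"
    '1001\tSalmo trutta Linnaeus, 1758\t111\tLake "Store" north shore\n'
    "1002\tEsox lucius Linnaeus, 1758\t222\tTrondheim\n"
)

rows = read_gbif_tsv(io.StringIO(SAMPLE))
for row in rows:
    # speciesKey is a GBIF-specific column rather than a DwC term.
    print(row["gbifID"], row["scientificName"], row["speciesKey"])
```

Disabling quote handling matters in this sketch because a stray quotation mark in a free-text field such as `locality` could otherwise cause several records to be merged into one.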

Idea for the long run: whereas there is a great wiki on the IPT to help publishers, there is no similar resource to help the users of the data.

dschigel commented 6 years ago

@andersfi I really like your points, and clear documentation on data use is indeed a much-needed improvement. I have in fact made a sketch of a data use guide some time ago; please let me know if you are willing to see the draft and revise the TOC. Your input on what should and should not be there would be particularly valuable, as you know the realities of both GBIF and the data user world.