tdwg / dwc-qa

Public question and answer site for discussions about Darwin Core
Apache License 2.0

Bulk Data Cleaning - Darwin Core Hour Input Form 2/8/2017 12:44:55 #24

Open iDigBioBot opened 7 years ago

iDigBioBot commented 7 years ago

A user submitted this information via the Darwin Core Hour webform:

Timestamp: 2/8/2017 12:44:55
Please provide a topic of interest: How to clean/reformat data efficiently and en-masse. EG: Depth measurements in many formats- convert all to meters
Are you capable of and interested in participating: No
Who else would you recommend to participate in the presentation: John or David Bloom
What resources can you point to: OpenRefine?
Your name: Ben Frable
Your email:
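For illustration, the depth example in the submission (many formats, all converted to meters) could be approached with a short script. The following is only a minimal Python sketch, not a method recommended in this thread: the function name `depth_to_meters`, the accepted unit spellings, and the two-decimal rounding are all assumptions, and real collection data will contain formats it does not cover. Values without a recognizable unit are returned as `None` so they can be reviewed by hand rather than guessed. The returned pair is meant to map onto the Darwin Core terms `minimumDepthInMeters` and `maximumDepthInMeters`.

```python
import re

# Conversion factors to meters. The unit spellings listed here are an
# assumption about what appears in the data; extend as needed.
UNIT_TO_METERS = {
    "m": 1.0, "meter": 1.0, "meters": 1.0, "metre": 1.0, "metres": 1.0,
    "ft": 0.3048, "foot": 0.3048, "feet": 0.3048,
    "fm": 1.8288, "fathom": 1.8288, "fathoms": 1.8288,
}

# Matches a single value ("30 ft") or a range ("12-15m", "10 to 20 m"),
# followed by a unit and an optional trailing period.
DEPTH_PATTERN = re.compile(
    r"""^\s*
    (?P<low>\d+(?:\.\d+)?)
    (?:\s*(?:-|to)\s*(?P<high>\d+(?:\.\d+)?))?
    \s*(?P<unit>[a-z]+)\.?
    \s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def depth_to_meters(raw):
    """Return (min_m, max_m) for a verbatim depth string, or None.

    None means the value could not be parsed or has no known unit and
    should be flagged for manual review (hypothetical helper).
    """
    match = DEPTH_PATTERN.match(raw or "")
    if match is None:
        return None
    factor = UNIT_TO_METERS.get(match.group("unit").lower())
    if factor is None:
        return None
    low = float(match.group("low")) * factor
    high = float(match.group("high")) * factor if match.group("high") else low
    return (round(min(low, high), 2), round(max(low, high), 2))


if __name__ == "__main__":
    for verbatim in ["10 m", "30 ft", "12-15m", "5 fathoms", "20", "deep"]:
        print(repr(verbatim), "->", depth_to_meters(verbatim))
```

Keeping the verbatim value alongside the converted one makes it possible to audit the conversion later; the same regular expression could also be adapted for use in a tool such as OpenRefine.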

debpaul commented 7 years ago

We need a label for data quality (DQ) issues, and perhaps a "skill needs" label? This ticket is really about strategies for cleaning data (tool and skill needs).

pzermoglio commented 7 years ago

Added data quality tag

debpaul commented 6 years ago

I think we need a space (on Twitter? or maybe biology.stackexchange? or somewhere else?) where we can post these questions to many more people. There must be scripts out there that people are using right now that we could point or link to.

stijnvanhoey commented 6 years ago

Just some references to existing examples of data cleaning using scripts:

Maybe good to list them elsewhere, together with other examples?