zenrabbit / Open_data_timer

How to decide when to publish data
notes #1

Open zenrabbit opened 3 years ago

zenrabbit commented 3 years ago

hi Emily,

Knowledge Network for Biocomplexity (KNB) is the data repo. Example: https://knb.ecoinformatics.org/view/doi%3A10.5063%2FRF5SF0

I prefer to publish when the fieldwork is done for several reasons:
a. It is better than Excel files etc. and forces me to tidy up the data.
b. It is much easier to share a link with all the meta-data - see above (saves so many emails with the team).
c. Fieldwork is done and paid for, so the data should be open.
d. The 'primary' purpose of the 5 years of data collection is done and published, I assume?

For me, I owed Mike the year 1 telemetry paper: I tidied the data and did the stats, Taylor wrote it up, then Mike took the lead and got it in print. Then, after three years, it was time to pay back TNC and BLM too with a paper at the end of the formal agreements with them both (and Canada too) - so Jenna did all the stats and I led the resource selection function paper. SO, we were kind of done. So, if you feel you have satisfied your primary purpose with years 4 & 5, sharing it just makes all our lives easier. That is the main reason. Plus, I like getting it out in the world - sometimes people catch errors too and email the authors. SO awesome. That happened to me recently, before the paper. TOTALLY saved my bacon.

Now could someone else find it and use it? Sure. And I say, enjoy 🙂 They are unlikely to write the same paper(s) we may or may never write about micro-habitat use or movement, haha. I do not think they could because it would be so hard without being there. BUT they could find a new use we are even less likely to do. My experience is not extensive here, but most data reuse is for another new purpose not really linked to the field methods so to speak. SO no risk to us.

So, for me, with the Canadian funding model, and I know that totally only applies to years 1-3, I gotta do it with or without a paper 'cause taxpayers paid, but all the reasons above really do just make my life easier anyway for collaboration with the team. In R, you can even read the data directly from the repo and it saves so much hassle.
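The same trick works outside R. Here is a minimal Python sketch of reading a table straight from a repository, assuming the repo exposes a direct download link to a CSV file. The function name `read_remote_csv` and the URL in the comment are hypothetical placeholders, not real KNB or figshare endpoints.

```python
import csv
import io
import urllib.request


def read_remote_csv(url):
    """Download a CSV file from `url` and return its rows as a list of dicts."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    # The first line of the file is treated as the header row.
    return list(csv.DictReader(io.StringIO(text)))


# Usage (hypothetical placeholder URL, swap in the real direct-download link):
# rows = read_remote_csv("https://example.org/path/to/telemetry.csv")
# print(len(rows), "rows read")
```

Everyone on the team then loads the same published copy instead of passing files around by email.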

Here is the telemetry data that we published, only the first year, associated with the Westphal-led paper in ECE: https://knb.ecoinformatics.org/view/doi%3A10.5063%2FF1736P23

I will put years 1-3 online at KNB. If you do not approve of sharing the rest, totally cool, and I get it if you are not prepared. You and Mike own years 4 and 5, your call 100%!!!

For the Reports paper I led, I already published the years 1-3 data here: https://figshare.com/articles/dataset/Telemetry_of_the_lizard_species_Gambelia_sila_at_Carrizo_Plain_National_Monument/8239667

Had to - because Nature owns figshare (not KNB) and needed the data online for that paper.

best, chris.

zenrabbit commented 3 years ago

calculate mean lag by IF
citations to data - test timing even for data online by duration not pre-paper
test benefit in own data - ie data in KNB and then data of paper
assess extent of web open data
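One possible reading of the "mean lag" note, sketched in Python: take the lag to be the number of days between a dataset going online and its paper being published, then average over datasets. That definition is an assumption, as are the function name `mean_lag_days` and the dates below, which are made up for illustration and are not real publication records.

```python
from datetime import date
from statistics import mean


def mean_lag_days(records):
    """Mean number of days from data release to paper publication.

    `records` is an iterable of (data_released, paper_published) date pairs.
    A negative lag means the paper came out before the data.
    """
    lags = [(paper - data).days for data, paper in records]
    return mean(lags)


# Made-up dates for illustration only.
example = [
    (date(2019, 1, 1), date(2019, 1, 31)),  # data online 30 days before the paper
    (date(2020, 3, 1), date(2020, 3, 11)),  # data online 10 days before the paper
]
print(mean_lag_days(example))
```

Grouping the records by journal before averaging would give a per-journal lag, if "by IF" means comparing across impact factors.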