Fireandplants / plant_gbif

This repository is for data and scripts related to plant species distribution across the globe using the Global Biodiversity Information Facility (GBIF) dataset.

Scripts to filter GBIF records and query climate and biome #17

Closed dmcglinn closed 9 years ago

dmcglinn commented 9 years ago

These commits update a batch of scripts used in Zanne et al. 2013 posted online here: http://datadryad.org/resource/doi:10.5061/dryad.63q27.2/4.1

The master script to understand the flow of these scripts is the Rscript: ./scripts/geog_filter/run_all.R

The following tasks are performed:

dschwilk commented 9 years ago

Great, Dan! How long does this take to run? I'll take a quick look and merge into master.

dschwilk commented 9 years ago

How many records and how many taxa in the end in gbif_coords.csv? Or is this done? I gave the commits and the overall steps a quick look through and it looks great.

dmcglinn commented 9 years ago

The export is still running; it's on genera starting with the letter S, heading to Z. I should know by tonight.

Dan

dmcglinn commented 9 years ago

It looks like the cleaned gbif flat file has 38,143,293 rows compared to the original file which had 78,669,155 rows. So 48% of the data was retained.
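As a quick check of that percentage, here is a minimal shell sketch using the two row counts quoted above. The variable names are only for illustration and are not taken from the repository's scripts.

```shell
# Row counts quoted in the comment above.
kept=38143293    # rows in the cleaned GBIF flat file
total=78669155   # rows in the original file

# awk does the floating-point division; LC_ALL=C keeps "." as the decimal mark.
pct=$(LC_ALL=C awk -v k="$kept" -v t="$total" 'BEGIN { printf "%.1f", 100 * k / t }')
echo "${pct}% of rows retained"
```

To one decimal place this gives 48.5%, consistent with the rounded 48% figure.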

SallyArchibald commented 9 years ago

38 million! Okidoki. Centre for High Performance Computing, here I come.

dmcglinn commented 9 years ago

The download links for exported files are as follows:

I'll keep these links active until 03/22/15.

dschwilk commented 9 years ago

Great!

dschwilk commented 9 years ago

@SallyArchibald: Have you had a chance to look at these data and start the fire-regime analyses? I suggest that the code for that work go in the "bigphylo" repository. We can change that name, as it is not great, but this next step is more specific to the actual biological questions we are asking, and we can leave this plant_gbif repo as the more general code for dealing with taxon name matching and extraction of GBIF data.

SallyArchibald commented 9 years ago

Hi Dylan. I finish lecturing today and will get started next week. I ran a trial with a few thousand points a while back, which worked ok.

Sure about the code. Just need to work out the GitHub code sharing rules.

Sally

dmcglinn commented 9 years ago

Let us know if you have any questions about how to contribute code on GitHub; it can seem a bit daunting at first. Here is a basic outline of the workflow that is generally suggested for collaboration:

  1. Fork plant_gbif by going to the repo page and clicking the "Fork" button at the top right. This will copy the repo to your personal account.
  2. Clone your fork to your local machine using the command: git clone git@github.com:SallyArchibald/plant_gbif.git
  3. Create a new branch that will contain your changes using the command: git branch query_fire (where "query_fire" is whatever name you decide to call your branch, typically a string that suggests what the branch is meant to accomplish)
  4. Check out the branch using: git checkout query_fire
  5. Add the code you have created to query the fire shapefiles using: git add mycode.R
  6. Commit your changes using: git commit -m "Adds scripts for querying the fire layers" (the text in quotes is your commit message)
  7. Push your changes using: git push origin query_fire (this puts the local changes on your query_fire branch online at your personal GitHub page)
  8. In your web browser, go to your personal copy of plant_gbif and navigate to your branch "query_fire". Once you're on that branch, there should be a button on the right called "Pull request".
  9. Submit your pull request with any relevant information. This will send all of us an email so that we can review your code and merge it back into the main repository.
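
Steps 2 to 7 can be sketched as a single shell session. This is only a sketch under some assumptions: so that it runs offline, a local bare repository (fork.git) stands in for the fork on GitHub, a throwaway "seed" repository gives it an initial commit, and the committer identity and the contents of mycode.R are placeholders. In real use you would clone git@github.com:SallyArchibald/plant_gbif.git instead.

```shell
set -eu

# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder identity (assumption) so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for the GitHub fork (assumption): a bare repo with one commit.
git init --quiet seed
(
  cd seed
  echo "plant_gbif stand-in" > README
  git add README
  git commit --quiet -m "Initial commit"
)
git clone --quiet --bare seed fork.git

# Step 2: clone the fork (here, the local stand-in).
git clone --quiet fork.git plant_gbif
cd plant_gbif

# Steps 3 and 4: create the branch, then check it out.
git branch query_fire
git checkout --quiet query_fire

# Step 5: stage the new script (placeholder contents).
echo "# code to query the fire shapefiles goes here" > mycode.R
git add mycode.R

# Step 6: commit with a descriptive message.
git commit --quiet -m "Adds scripts for querying the fire layers"

# Step 7: publish the branch to the fork.
git push --quiet origin query_fire

# Confirm the branch now exists on the remote.
git ls-remote --heads origin query_fire
```

Steps 3 and 4 can also be combined with git checkout -b query_fire, which creates the branch and switches to it in one command. Steps 1, 8 and 9 happen in the browser on GitHub.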