ReproducibleQM / NES

The National Eutrophication Survey: lake characteristics and historical nutrient concentrations
https://doi.org/10.5063/F1CZ35HF
Mozilla Public License 2.0

Data #5

Closed jsta closed 7 years ago

jsta commented 7 years ago

@dustinkincaid @nagelki4 @kingka21 @mill2735

Hey Everyone,

I ran my optical character recognition algorithm on the pdf yesterday in my first attempt at creating the full dataset. You can see the result at: https://raw.githubusercontent.com/jsta/nesR/master/vignettes/res.csv

jsta commented 7 years ago

The following paper has NES data in its supplement: Stomp, M., Huisman, J., Mittelbach, G.G., Litchman, E. and Klausmeier, C.A., 2011. Large‐scale biodiversity patterns in freshwater phytoplankton. Ecology, 92(11), pp.2096-2107.

kingka21 commented 7 years ago

Is this all of the data or just some of it that they used? It's a good start. 

jsta commented 7 years ago

From what I can tell, all the lakes are there. However, the dataset contains no dates and the columns are incomplete (there is no information on lake residence time, inflow, or drainage area). I'm thinking we could use this dataset to automatically QA the lake names, which seem to cause the greatest difficulty for the OCR.

I also wonder where the coordinates came from, because they are not in the PDFs and they don't match when checked with this package.
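A minimal sketch of what that name QA could look like, assuming the OCR output and the supplement each give a plain list of lake names. It uses fuzzy string matching from Python's standard `difflib`, which is a swapped-in illustration and not what nesR does. The function names, the 0.8 cutoff, and the lake names below are all hypothetical placeholders.

```python
import difflib


def normalize(name):
    """Uppercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.upper().split())


def qa_lake_names(ocr_names, reference_names, cutoff=0.8):
    """Map each OCR'd name to its closest reference name.

    Returns a dict of {ocr_name: reference_name or None}. None means no
    reference name was similar enough and the row needs a manual check.
    """
    ref_lookup = {normalize(ref): ref for ref in reference_names}
    ref_keys = list(ref_lookup)
    result = {}
    for raw in ocr_names:
        key = normalize(raw)
        if key in ref_lookup:
            # Exact match after normalization: nothing to correct.
            result[raw] = ref_lookup[key]
            continue
        # Otherwise take the single most similar reference name, if any.
        matches = difflib.get_close_matches(key, ref_keys, n=1, cutoff=cutoff)
        result[raw] = ref_lookup[matches[0]] if matches else None
    return result


# Placeholder names for illustration only (not taken from the NES PDFs).
reference = ["Lake Champlain", "Houghton Lake", "Gull Lake"]
ocr = ["LAKE CHAMPLA1N", "Houghton  Lake", "Gul1 Lake", "Unreadable"]

corrected = qa_lake_names(ocr, reference)
for ocr_name, ref_name in corrected.items():
    print(repr(ocr_name), "->", ref_name)
```

Anything that maps to `None` would still need to be checked by hand against the PDF, and the cutoff would need tuning on real OCR output.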

kingka21 commented 7 years ago

I just tried installing this through my MSU Google Docs account, and I'm blocked. I can install it from my personal account just fine. Did you have the same problem? I've just shared the Google Docs folder with my personal email -> Heather.Miller@net.elmhurst.edu. Thanks! Heather

Hi Everyone,

In the first session we were talking about how to cite things in Google Docs. A friend told me about this awesome add-on for Google Docs! I tested it out and it is pretty sweet. It also got good reviews. You can search for your article, click cite, and choose your citation format. The add-on will add the citation and build the bibliography as well! So we should be able to cite our paper as we go, which will save time at the end.

https://chrome.google.com/webstore/detail/paperpile/imanmdcibgaflfaibbcmmkifdgllfopm?hl=en

Katelyn 

nagelkirk commented 7 years ago

Sweet!

jsta commented 7 years ago

@kingka21 @nagelki4 @dustinkincaid @mill2735 @chanse-ford

I have finished the automated PDF scraping. You can find the data for the respective PDFs at:

https://github.com/jsta/nesR/blob/master/474/res.csv
https://github.com/jsta/nesR/blob/master/475/res.csv
https://github.com/jsta/nesR/blob/master/476/res.csv
https://github.com/jsta/nesR/blob/master/477/res.csv
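For anyone who wants all four files in one table, here is a rough sketch using only the Python standard library. It assumes the files are still at those paths, are UTF-8 encoded, and have a header row. The column names are not known here, so rows are kept as plain dictionaries. The helper names and the `source_pdf` column are made up for illustration.

```python
import csv
import io
import urllib.request

BLOB_URLS = [
    "https://github.com/jsta/nesR/blob/master/474/res.csv",
    "https://github.com/jsta/nesR/blob/master/475/res.csv",
    "https://github.com/jsta/nesR/blob/master/476/res.csv",
    "https://github.com/jsta/nesR/blob/master/477/res.csv",
]


def to_raw_url(blob_url):
    """Turn a GitHub 'blob' page URL into the matching raw-file URL."""
    prefix = "https://github.com/"
    if not blob_url.startswith(prefix) or "/blob/" not in blob_url:
        raise ValueError("not a GitHub blob URL: " + blob_url)
    path = blob_url[len(prefix):].replace("/blob/", "/", 1)
    return "https://raw.githubusercontent.com/" + path


def read_rows(csv_text, source):
    """Parse CSV text into dicts, tagging each row with the PDF it came from."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row, source_pdf=source) for row in reader]


def fetch_all(blob_urls):
    """Download every file and return one combined list of rows."""
    rows = []
    for url in blob_urls:
        source = url.split("/")[-2]  # folder name, e.g. "474"
        with urllib.request.urlopen(to_raw_url(url)) as resp:
            text = resp.read().decode("utf-8")  # assumed encoding
        rows.extend(read_rows(text, source))
    return rows
```

Calling `fetch_all(BLOB_URLS)` needs network access. Tagging each row with its source folder keeps it possible to trace a suspicious value back to the right PDF.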

kingka21 commented 7 years ago

Nice work! This is awesome and was obviously time-consuming.
