Automatically exported from code.google.com/p/unm-macroecology-2012

Process data from ACUPCC website into spreadsheet #13

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
I started on this -- here's a sample of 300 out of the 1500 available records.

The first file is the source code to do this.  The second is the actual data.

Again, the data is taken from here: http://rs.acupcc.org/stats/complete-ghg/

Could someone take a look at this and let me know what you think? Do you see
any problems?

Original issue reported on code.google.com by icos.atr...@gmail.com on 31 Jan 2012 at 12:12

Attachments:

GoogleCodeExporter commented 9 years ago
Excellent. Glad we have your skills to help us with this project! If you need 
someone to help with the process, let me know. I'd be happy to scrape the next 
1200 if you tell me how it's done.

Mary Brandenburg

Original comment by marymai...@gmail.com on 31 Jan 2012 at 4:43

GoogleCodeExporter commented 9 years ago
I would also like to learn how to use this technique if you have time to teach
it to a novice.

Thank you,
Kevin

Original comment by SKMcCorm...@gmail.com on 31 Jan 2012 at 8:58

GoogleCodeExporter commented 9 years ago
For reference:
I mainly used this blog post to write the script:
http://palewire.com/posts/2008/04/20/python-recipe-grab-a-page-scrape-a-table-download-a-file/
The package is BeautifulSoup, here -- it's totally awesome, and has great 
documentation:
http://www.crummy.com/software/BeautifulSoup/documentation.html
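
For anyone who wants to see the general shape of "scrape a table" before opening scrape.py, here is a rough sketch. It is not the attached script: it swaps in Python's standard-library html.parser instead of BeautifulSoup so it runs with nothing installed, and the sample markup, column names, and the names TableScraper and scrape_table are made up for illustration. The real ACUPCC page layout may well differ.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table in `html` as a list of dicts keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample markup, not copied from the ACUPCC site.
SAMPLE_HTML = """
<table>
  <tr><th>Institution</th><th>Gross emissions</th></tr>
  <tr><td>Example College</td><td>12345</td></tr>
  <tr><td>Sample University</td><td>67890</td></tr>
</table>
"""

records = scrape_table(SAMPLE_HTML)
```

Each record comes back as a dict keyed by the header cells, which is the shape csv.DictWriter expects when writing the spreadsheet.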

@Kevin: Take a look at the source above (scrape.py) if you're interested. I've
added lots of comments, so it should be at least somewhat self-explanatory.
I'm happy to talk about this sometime in the future. Python's a great tool to
be familiar with, especially for this kind of thing.

@Mary -- I just need to fix a bug (DictWriter doesn't like unicode, apparently) 
and then it should get the rest of them!
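
Some background on that bug, as a sketch rather than the actual fix in scrape.py (which isn't shown here): in Python 2 the csv module did not handle unicode strings containing non-ASCII characters, and a common workaround was to encode each value to UTF-8 before writing. In Python 3 the usual approach is to open the output in text mode with an explicit encoding, roughly like this. The function names, field names, and sample row are hypothetical.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text, header row first."""
    # newline="" leaves the csv module in charge of line endings.
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, rows, fieldnames):
    """Write the rows to `path` as UTF-8 so accented names survive."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv(rows, fieldnames))


# Hypothetical record with a non-ASCII character in the name.
sample_rows = [{"institution": "Universidad de Ejemplo (café)", "state": "PR"}]
csv_text = rows_to_csv(sample_rows, ["institution", "state"])
```

Writing to an in-memory buffer first keeps the serialization step easy to check separately from the file handling.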

Original comment by icos.atr...@gmail.com on 1 Feb 2012 at 6:28

GoogleCodeExporter commented 9 years ago
See attached for the full spreadsheet.
This is everything -- let me know if you encounter any issues with this. 

Original comment by icos.atr...@gmail.com on 1 Feb 2012 at 11:33

Attachments: