weecology / retriever

Quickly download, clean up, and install public datasets into a database management system
http://data-retriever.org

BBS import in Access includes only the last state's data #52

Closed ethanwhite closed 12 years ago

ethanwhite commented 12 years ago

Using the latest .exe for v1.4, only the rows from Yukon are present in the Counts table. This is presumably because the table is being recreated for every state (see command line output below). This does not happen when using the same .exe with Postgres.

Inserting data from Yukon...
Inserting data from Yukon.csv...

INSERT INTO [BBS counts] (record_id, countrynum, statenum, Route, RPID, Year, Aou, Count10, Count20, Count30, Count40, Count50, StopTotal, SpeciesTotal)
SELECT * FROM [Yukon_new.csv]
IN "C:\Users\ethan\Desktop\raw_data\BBS" "Text;FMT=CSVDelimited;HDR=Yes;"
Couldn't bulk insert. Trying manual insert.
Creating table [BBS counts]...
Inserting rows to [BBS counts]: 5436834 / 5436834

The raw data folder also ends up with a set of State_new.csv files, which are not created when using other engines. Use of these files can be seen in the output above, and on one occasion I saw the retriever try to import both the State.csv and State_new.csv files, which doubled the final row count. I'm guessing this is all related, but we can split the issue if it isn't.
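To illustrate the double-import symptom described above, here is a minimal sketch of one way to pick only the original State.csv files from a folder listing. It assumes the importer discovers files by listing the raw data folder, which this thread does not confirm; `source_csvs` and the sample file names are hypothetical and not part of the retriever.

```python
def source_csvs(filenames, derived_suffix="_new.csv"):
    """Return only the original CSV files, skipping derived *_new.csv copies.

    Hypothetical helper: the retriever's real file discovery may differ.
    """
    selected = []
    for name in filenames:
        lower = name.lower()
        # Keep CSV files, but not the derived copies written next to them.
        if lower.endswith(".csv") and not lower.endswith(derived_suffix):
            selected.append(name)
    return sorted(selected)


# Example folder listing (made-up names following the State.csv pattern).
listing = ["Yukon.csv", "Yukon_new.csv", "Alabama.csv", "Alabama_new.csv", "readme.txt"]
originals = source_csvs(listing)
print(originals)
```

With a filter like this, each state would be read once even if both State.csv and State_new.csv are present in the folder.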

ethanwhite commented 12 years ago

Looks like maybe an extra call to create_table() on line 121 of msaccess.py.

I don't have access to a Windows development machine at the moment, but I'll make the change and test it in a couple of hours unless Ben beats me to it.
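A rough sketch of why an extra create_table() call inside the per-state loop would leave only the last state's rows. This uses Python's built-in sqlite3 as a stand-in for Access, and it assumes create_table() drops and recreates the table; the table layout, `import_states`, and the sample data are illustrative only and are not the code in msaccess.py.

```python
import sqlite3


def create_table(conn):
    # Assumption: creating the table starts from scratch, discarding old rows.
    conn.execute("DROP TABLE IF EXISTS counts")
    conn.execute("CREATE TABLE counts (state TEXT, route INTEGER)")


def import_states(conn, data, recreate_per_state):
    """Insert rows state by state and return the states left in the table."""
    if not recreate_per_state:
        create_table(conn)  # intended behavior: create once, before the loop
    for state, routes in data.items():
        if recreate_per_state:
            create_table(conn)  # suspected extra call: wipes earlier states
        conn.executemany(
            "INSERT INTO counts VALUES (?, ?)",
            [(state, route) for route in routes],
        )
    rows = conn.execute("SELECT DISTINCT state FROM counts ORDER BY state")
    return [row[0] for row in rows]


data = {"Alabama": [1, 2], "Yukon": [3]}
buggy = import_states(sqlite3.connect(":memory:"), data, recreate_per_state=True)
fixed = import_states(sqlite3.connect(":memory:"), data, recreate_per_state=False)
print(buggy)  # only the last state survives
print(fixed)  # all states are present
```

Under these assumptions, moving the table creation out of the loop is enough to keep every state's rows.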

ethanwhite commented 12 years ago

Fixed in master in b34201d393356b5c95df234af8309c0ad87f99ed. Doing one more round of testing before making the change in 1.4.

ethanwhite commented 12 years ago

Note that the existence of the _new.csv files is a separate issue - #53.

ethanwhite commented 12 years ago

The doubled import counts are also a separate issue - #55.

ethanwhite commented 12 years ago

Cherry-picked into v1.4 and recompiled the executable in 20a67be3a8a7250a70dc24bab48acbb141fcb67b.