18F / open-data-maker

make it easy to turn a lot of potentially large csv files into easily accessible open data
Other
199 stars 135 forks source link

Refactor Index process #294

Closed baccigalupi closed 8 years ago

baccigalupi commented 8 years ago

This is broken into a bunch of small files and the Import and RowImport each have a process method that gives a step by step definition of what is going on. Also easier to understand is the logging, but via normalizing a bit. There is a logging observer who looks for events and logs object with labels. It cuts down some on the unique snowflake overhead of the logging, but means there is no unique snowflake logging.

There need to be some tests for the logger, which I think isn't quite passing along args as expected. Would also be good to create unit tests for some of the other objects.

Will do that in future commits.

@ultrasaurus

pkarman commented 8 years ago

love the direction this is headed. much easier to track the logical flow.

pkarman commented 8 years ago

@baccigalupi I went ahead and based #295 off this branch. We can chat about it tomorrow.

baccigalupi commented 8 years ago

I just added some unit tests to this for the stuff that felt pretty good as stand alone objects. Outstanding concerns:

There are methods for preprocessing the data in the BuilderData and also the DocumentBuilder. Seems like they should be joined in some way, and maybe the logic in Document which is a decorator around the finished data.

I think it will take disassembling the DocumentBuilder to figure that out.

baccigalupi commented 8 years ago

@ultrasaurus @pkarman This is a pretty huge pull request. I think there is more to do, but would rather do it in other commits. What are next steps to get this merged?

pkarman commented 8 years ago

/me looks at @ultrasaurus

ultrasaurus commented 8 years ago

I will review. I renamed the PR

FYI @baccigalupi @pkarman our convention is WIP = not ready to merge, just wanting people to look at it :)