anidata / ht-etl

Anidata 1.0: ETL and algorithm code.

Extracting external sites #14

Closed bmenn closed 7 years ago

bmenn commented 7 years ago

We need functionality to identify other potential sites to scrape.
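One possible starting point, sketched with only the standard library: pull URLs out of scraped text and count the hostnames we are not already scraping. The `external_hosts` helper, the `URL_RE` pattern, and the sample ad strings are hypothetical illustrations, not code from this repository.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical pattern: matches http(s) URLs embedded in free text.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def external_hosts(texts, known_hosts):
    """Count hostnames mentioned in `texts` that are not in `known_hosts`."""
    counts = Counter()
    for text in texts:
        for url in URL_RE.findall(text):
            host = (urlparse(url).hostname or "").lower()
            # Treat "www.example.test" and "example.test" as the same site.
            if host.startswith("www."):
                host = host[4:]
            if host and host not in known_hosts:
                counts[host] += 1
    return counts


# Made-up sample data; real input would come from the scraped records.
ads = [
    "See more at http://www.example-a.test/profile/1",
    "Also posted on https://example-b.test/ad?id=2 and http://example-a.test/x",
    "Original listing: http://already-scraped.test/123",
]
candidates = external_hosts(ads, known_hosts={"already-scraped.test"})
print(candidates.most_common())
```

Ranking by mention count is only one heuristic; hosts that appear often are likely candidates for a new scraper, but the list would still need manual review.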

bmenn commented 7 years ago

If someone gets to this before I do, we could use SQLAlchemy data model objects to simplify data access (but this does not mean less code).

http://docs.sqlalchemy.org/en/latest/orm/extensions/automap.html?highlight=automap#module-sqlalchemy.ext.automap
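A minimal sketch of what the automap approach could look like, assuming SQLAlchemy 1.4 or newer. The in-memory SQLite database, the `ad` table, and its columns are hypothetical stand-ins for the real schema; automap only maps tables that have a primary key.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy.pool import StaticPool

# In-memory SQLite shared through a single connection (illustration only).
engine = create_engine(
    "sqlite://",
    poolclass=StaticPool,
    connect_args={"check_same_thread": False},
)

# Hypothetical table standing in for an existing schema.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE ad (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"))
    conn.execute(
        text("INSERT INTO ad (id, body) VALUES (:id, :body)"),
        [{"id": 1, "body": "first"}, {"id": 2, "body": "second"}],
    )

# Reflect the database and generate mapped classes from the existing tables.
Base = automap_base()
Base.prepare(autoload_with=engine)
Ad = Base.classes.ad  # class name follows the table name

with Session(engine) as session:
    bodies = [ad.body for ad in session.query(Ad).order_by(Ad.id)]
print(bodies)
```

The mapped classes come from reflection, so no model definitions need to be kept in sync with the schema by hand; the trade-off is that the engine and reflection setup still has to live somewhere.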

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling 836cb122d4625b6288ba9de09e1d096b3c887c5a on external_sites into master.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling 6eef632c7cb8f84d87c41abab0c94d01cce88e6c on external_sites into master.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling ba149d7e90da99281ebaccc6886db1ab65d04e30 on external_sites into master.

bmenn commented 7 years ago

@dlrobertson @egrossman

Not 100% done here, but almost. I need to add a simple integration test and then I'll be good. There's also an example of unit testing a Luigi task.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling 5bfee4426cdc65434e5c8eac4f7456337a75da06 on external_sites into master.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling 897fee8f2782b4e34d3e887af61d6c33e9d99f78 on external_sites into master.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling cc7be25dcf494660bd7c270965e9fb542d66e018 on external_sites into master.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling 83dc075875aa700faf77d679aa242da9cec23303 on external_sites into master.

coveralls commented 7 years ago

Coverage Status

Changes Unknown when pulling 0f3980ce1f6f5a76d4935e5e2ebacf861a267d91 on external_sites into master.