Closed tbuckl closed 9 years ago
Hi Tom. Assuming spandex understands the data types (to map them to Python types), you can edit the database outside of spandex. If you changed the table schema while running spandex, you need to run database.refresh().
I see two errors in the traceback:
Did not recognize type 'bpchar' of column 'county_id'
Did not recognize type 'unknown' of column 'imputation_flag'
bpchar (blank padded char) is an internal PostgreSQL type. Can you check on the data type of those columns? You could try explicitly casting to int or char as appropriate.
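As a rough illustration of the explicit cast suggested above, here is a minimal sketch that only builds the PostgreSQL statement as a string. The helper name cast_column_sql and the table name parcels are hypothetical (only the column name county_id comes from the traceback), and whether the cast succeeds depends on the data actually stored in the column.

```python
def cast_column_sql(table, column, new_type, schema="public"):
    """Build an ALTER TABLE statement that casts a column to new_type.

    Hypothetical helper for illustration; it is not part of spandex.
    new_type is inserted as-is, so only pass trusted type names.
    """
    def quote(identifier):
        # Double any embedded quotes, then wrap in double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    return (
        f"ALTER TABLE {quote(schema)}.{quote(table)} "
        f"ALTER COLUMN {quote(column)} TYPE {new_type} "
        f"USING {quote(column)}::{new_type};"
    )


# 'parcels' is an assumed table name; 'county_id' is from the traceback.
stmt = cast_column_sql("parcels", "county_id", "integer")
print(stmt)
```

The resulting statement could be run in psql or through any database connection. After changing a column type this way, the in-memory schema would need to be rebuilt with database.refresh().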
It might be helpful to uncomment the logger.warn call in spandex.database.
FYI: spandex recreates its knowledge of the schema each time the database.refresh() method is called. That knowledge is only kept in memory and doesn't persist if you restart Python/spandex.
When spandex modifies table schema (for example to add a new column), it calls database.refresh(). database.refresh() iterates over every schema, table, and column using SQLAlchemy/GeoAlchemy introspection/reflection.
For more information about reflection, see the SQLAlchemy documentation on Reflecting Database Objects and Using Reflection with Declarative.
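To make the "in-memory snapshot" behavior concrete, here is a minimal sketch using plain SQLAlchemy reflection. It does not use spandex itself, and an in-memory SQLite database stands in for PostgreSQL so that it runs anywhere. The table names are assumptions for illustration.

```python
from sqlalchemy import MetaData, create_engine, text

# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = create_engine("sqlite://")

with engine.begin() as conn:
    conn.execute(text("CREATE TABLE parcels (id INTEGER PRIMARY KEY, county_id TEXT)"))

# Reflect once: this is a snapshot of the schema at this moment.
metadata = MetaData()
metadata.reflect(bind=engine)
tables_before = sorted(metadata.tables)

# Simulate a schema edit made outside the ORM (for example in psql).
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE buildings (id INTEGER PRIMARY KEY)"))

# The existing MetaData object does not notice the new table.
tables_stale = sorted(metadata.tables)

# Reflecting again into a fresh MetaData picks up the change.
fresh = MetaData()
fresh.reflect(bind=engine)
tables_refreshed = sorted(fresh.tables)

print(tables_before, tables_stale, tables_refreshed)
```

This is only an analogy for what database.refresh() does. The point is that reflected schema knowledge goes stale after an external edit until reflection is run again.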
Thanks @daradib! I appreciate the feedback and I'll try this out when I get a moment.
By uncommenting that logger.warn I was able to discover that the tables that were not showing up in the load.tables/database class were not being mapped because they did not have a primary key. Is this a GeoAlchemy convention (or requirement) for reflection or a Spandex one? SQLAlchemy does not require this. The QGIS database adapter also has this requirement/convention. Perhaps it's necessary? However, it clearly confused me. Anyway, thanks @daradib!
No problem! Hope you find spandex useful despite its quirks and limitations.
The primary key constraint is a SQLAlchemy ORM requirement: http://docs.sqlalchemy.org/en/latest/faq/ormconfiguration.html#how-do-i-map-a-table-that-has-no-primary-key
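For anyone hitting the same problem, a quick way to spot tables that would be skipped is to ask the SQLAlchemy inspector which tables have no primary key. This is a minimal sketch and not part of spandex: the function name tables_without_primary_key and the table names are hypothetical, and in-memory SQLite is used only so the example runs without a PostgreSQL server.

```python
from sqlalchemy import create_engine, inspect, text


def tables_without_primary_key(engine, schema=None):
    """Return sorted names of tables that have no primary key columns."""
    inspector = inspect(engine)
    missing = []
    for name in inspector.get_table_names(schema=schema):
        pk = inspector.get_pk_constraint(name, schema=schema)
        if not pk.get("constrained_columns"):
            missing.append(name)
    return sorted(missing)


# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE parcels (id INTEGER PRIMARY KEY, county_id TEXT)"))
    conn.execute(text("CREATE TABLE staging_parcels (county_id TEXT)"))

unmapped = tables_without_primary_key(engine)
print(unmapped)
```

Against PostgreSQL, the same function could be called with an engine for that database and schema="public". Adding a primary key to each listed table and then refreshing should let the ORM map it.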
It's unclear to me as a user whether I can edit a database that Spandex is using/managing outside of the Spandex ORM without breaking the Spandex database class' understanding of the schema.
In short, the Spandex database class does not seem to show any existing tables on my public schema after dropping and then re-adding a table in psql and then inspecting the database with the Spandex database class.
The long version: This is a pretty simple and use-case-specific SQL query that takes a few minutes:
https://github.com/synthicity/spandex/blob/master/spandex/spatialtoolz.py#L495-L513
It works fine when you run this script from start to finish:
https://github.com/synthicity/bayarea_urbansim/blob/master/data_regeneration/run.py
Unfortunately, running that script from start to finish takes more than 12 hours on a well-provisioned (and tuned) machine.
As a user, it would be nice to be able to just call that specific SQL query on an arbitrary table. However, it seems that there may be some conventions or dependencies that I am not following in calling it.
In particular, I suspect that I am getting an error because I am not calling that function on a table that was specifically created or registered with one of the several ORMs (2 if you count Spandex as an ORM?) that seem to be in use in this repository.
The error is below. As a user, this means I will probably re-write the query from the ORM language into SQL in order to accomplish my larger goal of reducing the run-time of data regenerations.