UCL-ShippingGroup / pyrate

Python AIS Tools Environment
MIT License
25 stars 10 forks source link

<table>_db_spec fieldnames converted to lowercase when table is generated #16

Open ejokeeffe opened 8 years ago

ejokeeffe commented 8 years ago

For example... clean_db_spec = { 'cols': [ ('MMSI', 'integer'), ('Time', 'timestamp without time zone'), ('Message_ID', 'integer'), ('Navigational_status', 'integer'), ('SOG', double_type), ('Longitude', double_type), ('Latitude', double_type), ('COG', double_type), ('Heading', double_type), ('IMO', 'integer null'), ('Draught', double_type), ('Destination', 'character varying(255)'), ('Vessel_Name', 'character varying(255)'), ('ETA_month', 'integer'), ('ETA_day', 'integer'), ('ETA_hour', 'integer'), ('ETA_minute', 'integer'), ('source', 'smallint'), ('ID', 'SERIAL PRIMARY KEY') ], 'indices': [ ('dt_idx', ['Time']), ('imo_idx', ['IMO']), ('lonlat_idx', ['Longitude', 'Latitude']), ('mmsi_idx', ['MMSI']), ('msg_idx', ['Message_ID']), ('source_idx', ['source']), ('mmsi_imo_idx', ['MMSI','IMO']) ] }

The script that builds the sql string does not insert double quotes so the fieldnames are converted to lower case when the database table is being created. Only really becomes an issue when you look to pull from the database and access these fields in a dataframe using the defined fieldname constants MMSI = 'MMSI' TIME = 'Time' MESSAGE_ID = 'Message_ID' NAV_STATUS = 'Navigational_status' SOG = 'SOG' LONGITUDE = 'Longitude' So, for example df[MMSI] would fail (where df is a pandas dataframe). It would make sense to have them as lowercase or add the double quotes when building the database.

ejokeeffe commented 8 years ago

Looks like the code deliberately sets it to lower case columns = ["\"{}\" {}".format(c[0].lower(), c[1]) for c in self.cols]

Why have it as capitalised in the first place?