simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.07k stars 648 forks source link

Foreign key links break for compound foreign keys #1098

Closed simonw closed 3 years ago

simonw commented 3 years ago

Reported on Twitter here: https://twitter.com/ZaneSelvans/status/1328093641395548161

Maybe I'm doing something wrong here but the automatically generated links based foreign key relationships seem to be working here for utility_id_eia, but not for plant_id_eia & generator_id which seems odd: https://pudl-datasette-xl7xwcpe2a-uc.a.run.app/pudl/generators_eia860

Right now it seems like they're trying to, but with only one of the two keys, so it gives "Error 500. You did not supply a value for binding 2." Maybe only create the links when it's a simple foreign key?

simonw commented 3 years ago

Need to replicate this in the fixtures, then fix it.

simonw commented 3 years ago

Schema for that broken example:

CREATE TABLE generators_eia860 (
    id INTEGER NOT NULL, 
    plant_id_eia INTEGER, 
    generator_id TEXT, 
    report_date DATE, 
    operational_status_code TEXT, 
    operational_status TEXT, 
    ownership_code TEXT, 
    utility_id_eia INTEGER, 
    capacity_mw FLOAT, 
    summer_capacity_mw FLOAT, 
    winter_capacity_mw FLOAT, 
    energy_source_code_1 TEXT, 
    energy_source_code_2 TEXT, 
    energy_source_code_3 TEXT, 
    energy_source_code_4 TEXT, 
    energy_source_code_5 TEXT, 
    energy_source_code_6 TEXT, 
    fuel_type_code_pudl TEXT, 
    multiple_fuels BOOLEAN, 
    deliver_power_transgrid BOOLEAN, 
    syncronized_transmission_grid BOOLEAN, 
    turbines_num INTEGER, 
    planned_modifications BOOLEAN, 
    planned_net_summer_capacity_uprate_mw FLOAT, 
    planned_net_winter_capacity_uprate_mw FLOAT, 
    planned_uprate_date DATE, 
    planned_net_summer_capacity_derate_mw FLOAT, 
    planned_net_winter_capacity_derate_mw FLOAT, 
    planned_derate_date DATE, 
    planned_new_prime_mover_code TEXT, 
    planned_energy_source_code_1 TEXT, 
    planned_repower_date DATE, 
    other_planned_modifications BOOLEAN, 
    other_modifications_date DATE, 
    planned_retirement_date DATE, 
    carbon_capture BOOLEAN, 
    startup_source_code_1 TEXT, 
    startup_source_code_2 TEXT, 
    startup_source_code_3 TEXT, 
    startup_source_code_4 TEXT, 
    technology_description TEXT, 
    turbines_inverters_hydrokinetics TEXT, 
    time_cold_shutdown_full_load_code TEXT, 
    planned_new_capacity_mw FLOAT, 
    cofire_fuels BOOLEAN, 
    switch_oil_gas BOOLEAN, 
    nameplate_power_factor FLOAT, 
    minimum_load_mw FLOAT, 
    uprate_derate_during_year BOOLEAN, 
    uprate_derate_completed_date DATE, 
    current_planned_operating_date DATE, 
    summer_estimated_capability_mw FLOAT, 
    winter_estimated_capability_mw FLOAT, 
    retirement_date DATE, 
    PRIMARY KEY (id), 
    FOREIGN KEY(plant_id_eia, generator_id) REFERENCES generators_entity_eia (plant_id_eia, generator_id), 
    FOREIGN KEY(utility_id_eia) REFERENCES utilities_entity_eia (utility_id_eia), 
    CHECK (multiple_fuels IN (0, 1)), 
    CHECK (deliver_power_transgrid IN (0, 1)), 
    CHECK (syncronized_transmission_grid IN (0, 1)), 
    CHECK (planned_modifications IN (0, 1)), 
    CHECK (other_planned_modifications IN (0, 1)), 
    CHECK (carbon_capture IN (0, 1)), 
    CHECK (cofire_fuels IN (0, 1)), 
    CHECK (switch_oil_gas IN (0, 1)), 
    CHECK (uprate_derate_during_year IN (0, 1))
);

https://pudl-datasette-xl7xwcpe2a-uc.a.run.app/pudl/generators_entity_eia is:

CREATE TABLE generators_entity_eia (
    plant_id_eia INTEGER NOT NULL, 
    generator_id TEXT NOT NULL, 
    prime_mover_code TEXT, 
    duct_burners BOOLEAN, 
    operating_date DATE, 
    topping_bottoming_code TEXT, 
    solid_fuel_gasification BOOLEAN, 
    pulverized_coal_tech BOOLEAN, 
    fluidized_bed_tech BOOLEAN, 
    subcritical_tech BOOLEAN, 
    supercritical_tech BOOLEAN, 
    ultrasupercritical_tech BOOLEAN, 
    stoker_tech BOOLEAN, 
    other_combustion_tech BOOLEAN, 
    bypass_heat_recovery BOOLEAN, 
    rto_iso_lmp_node_id TEXT, 
    rto_iso_location_wholesale_reporting_id TEXT, 
    associated_combined_heat_power BOOLEAN, 
    original_planned_operating_date DATE, 
    operating_switch TEXT, 
    previously_canceled BOOLEAN, 
    PRIMARY KEY (plant_id_eia, generator_id), 
    FOREIGN KEY(plant_id_eia) REFERENCES plants_entity_eia (plant_id_eia), 
    CHECK (duct_burners IN (0, 1)), 
    CHECK (solid_fuel_gasification IN (0, 1)), 
    CHECK (pulverized_coal_tech IN (0, 1)), 
    CHECK (fluidized_bed_tech IN (0, 1)), 
    CHECK (subcritical_tech IN (0, 1)), 
    CHECK (supercritical_tech IN (0, 1)), 
    CHECK (ultrasupercritical_tech IN (0, 1)), 
    CHECK (stoker_tech IN (0, 1)), 
    CHECK (other_combustion_tech IN (0, 1)), 
    CHECK (bypass_heat_recovery IN (0, 1)), 
    CHECK (associated_combined_heat_power IN (0, 1)), 
    CHECK (previously_canceled IN (0, 1))
);
simonw commented 3 years ago

For the moment I'm going to solve this by teaching Datasette's internal introspection methods - in particular these ones - to ignore compound foreign keys entirely:

https://github.com/simonw/datasette/blob/e800ffcf7cc6a915eb554b369c654f87162575e5/datasette/utils/__init__.py#L470-L505

simonw commented 3 years ago

Fix is out in 0.52.1: https://docs.datasette.io/en/latest/changelog.html#v0-52-1

simonw commented 3 years ago

Demo of the fix: https://latest.datasette.io/fixtures/foreign_key_references (the compound foreign key columns do not link to anything)