OpenGen / GenSQL.inference

Allow for overlapping constraints and targets in ClojureCat's `logpdf` and `simulate` #17

Closed cameronfreer closed 1 year ago

cameronfreer commented 2 years ago

With IQL publish a1d965d in strict mode, using the gapminder data and model from IQL analyses, running the cell

SELECT * FROM
  GENERATE
    *
  UNDER baseline_model
  CONDITIONED BY VAR privately_owned_forest_percent = 60
  AND VAR female_contributing_family_workers_percent = 10
LIMIT 10

yields a pop-up "Internal query execution error" message along with a long stack trace in the terminal, which begins

clojure.lang.ExceptionInfo: Targets and constraints must be unique! These are shared: (:privately_owned_forest_percent :female_contributing_family_workers_percent)

According to @Schaechtle in https://probcomp.slack.com/archives/C01C5LW3WPN/p1662496584594669?thread_ts=1662495530.926179&cid=C01C5LW3WPN this is "definitely residue from Nick's port of Python-CGPM to ClojureCat. It should not happen (and won't if you use SPPL as a backend)."

cameronfreer commented 2 years ago

full error:

ERROR "/api/query"
clojure.lang.ExceptionInfo: Targets and constraints must be unique! These are shared: (:privately_owned_forest_percent :female_contributing_family_workers_percent)
{:targets (:one_year_olds_immunized_with_three_doses_of_hepatitis_b_hepb3_percent :forest_area_sq_km :total_number_of_billionaires :population_density_per_square_km :car_mortality_per_100000_age_adjusted :storm_affected :suicide_15to29_all_age_adj :underweight_children :tb_incidence_smear_positive_per_100_000_population_per_year :earthquake_killed :ifpri_underweight_children :female_15to19_years_percent :air_accidents_killed :people_living_with_hiv :water_and_sanitation_aid_percent_of_total_aid :sex_ratio_15to49 :prevalence_of_current_tobacco_use_among_adults_larger_or_equal15_years_percent_female :armed_forces_personnel :hydro_production_per_person_toe :tax_revenue_percent_of_gdp :residential_electricity_consumption_per_person_kwh :external_debt_stocks_percent_of_gni :economical_infrastructure_aid_percent_of_total :rti_60plus_all_age_adj :health_aid_percent_of_total_aid :agriculture_value_added_percent_of_gdp :debt_servicing_costs_percent_of_exports_and_net_income_from_abroad :primary_completion_rate_total_percent_of_relevant_age_group :new_smear_positive_cases_per_100_000_population :gdp_per_working_hour_constant_1990usd :air_accidents_affected :exports_unit_value_index_2000equals100 :mean_years_in_school_men_25_years_and_older :total_55plus_unemployment_percent :biomass_stock_in_forest_ton :income_share_held_by_fourth_20percent :arms_exports_constant_1990_ususd :tb_incidence_all_forms_in_hiv_positive_per_100_000_population_per_year :infant_mortality_rate :working_hours_per_week :road_traffic_total_deaths :female_20to39_years_percent :internet_users_per_100_people :homicide_45to59_all_age_adj :whole_country_new_smear_positive_case_detection_rate_percent :per_capita_government_expenditure_on_health_at_average_exchange_rate_ususd :contraceptive_prevalence_percent_of_women_ages_15to49 :proportion_of_the_population_using_improved_drinking_water_sources_urban :dependency_ratio :coal_consumption_per_person_tonnes_oil_equivalent :homicide_30to44_all_age_adj 
:merchandise_trade_percent_of_gdp :out_of__expenditure_as_percentage_of_total_health_expenditure :income_share_held_by_third_20percent :foreign_direct_investment_net_outflows_percent_of_gdp :tb_prevalence_all_forms_per_year :population_policies_aid_percent_of_total :female_salaried_employees_percent :proportion_of_the_population_using_improved_sanitation_facilities_urban :poverty_headcount_ratio_at_usd2_a_day_ppp_percent_of_population :per_capita_total_expenditure_on_health_at_average_exchange_rate_ususd :tb_incidence_all_forms_in_hiv_positive_per_year :male_5to9_years_percent :male_15plus_unemployment_percent :reported_cases :male_15to24_unemployment_percent :gross_capital_formation_percent_of_gdp :external_debt_stocks_total_dod_current_ususd :extreme_temp_affected :co2_emissions_kg_per_2005_ppp_usd_of_gdp :proportion_of_the_population_using_improved_sanitation_facilities_rural :gni_per_capita_constant_2000_ususd :number_of_child_deaths :total_25to54_unemployment_percent :male_contributing_family_workers_percent :total_allocable_aid_2007_ususd :mean_years_in_school_women_of_reproductive_age_15_to_44 :total_population_male :total_above_15_employment_to_population_percent :forest_products_per_ha_usd :personal_computers :poverty_headcount_ratio_at_national_poverty_line_percent_of_population :lifetime_risk_per_1000_of_maternal_deaths :population_growth_annual_percent :physicians_per_1000_people :total_population_female :ratio_of_young_literate_females_to_males_percent_ages_15to24 :teen_fertility :gni_per_capita_ppp_current_international_usd :forest_plantation_area_ha :dead_kids_per_woman :total_5to9_years_percent :forest_coverage_percent :urban_population_percent_of_total :rti_15to29_all_age_adj :aid_received_percent_of_gni :homicide_0to14_all_age_adj :fixed_line_and_mobile_phone_subscribers_per_100_people :murder_per_100000_age_adjusted :mobile_cellular_subscriptions_per_100_people :male_above_60_percent :male_0to4_years_percent :trade_balance_percent_of_gdp 
:sex_ratio_50plus :general_government_expenditure_on_health_as_percentage_of_total_government_expenditure :suicide_age_adjusted_per_100_000_standard_population :tb_mortality_all_forms_in_hiv_positive_per_100_000_population_per_year :children_per_woman :gdp_per_capita_pwt_71 :male_10to14_years_percent :total_15to19_years_percent :suicide_60plus_all_age_adj :two_wheeler_motorized_mortality_per_100000 :motor_vehicles_not_two_wheelers_per_1000_population :imports_of_goods_and_services_percent_of_gdp :net_barter_terms_of_trade_2000_equals_100 :electricity_generation_per_person_kilowatt_hours :roads_paved_percent_of_total_roads :male_service_workers_percent :market_capitalization_of_listed_companies_percent_of_gdp :female_5to9_years_percent :income_share_held_by_lowest_20percent :gdp_per_capita_ppp_with_projections :male_long_term_unemployment_percent :services_etc_value_added_percent_of_gdp :forest_land_ha :gdp_per_capita_ppp_2005usd :female_agriculture_workers_percent :male_salaried_employees_percent :extreme_temp_killed :total_65plus_labour_to_population_percent :suicide_45to59_all_age_adj :dots_new_smear_positive_case_detection_rate_percent :tb_incidence_smear_positive_per_year :hiv_incidence_percent_ages_15to49 :earthquake_affected :surviving_kids_per_woman :gdp_per_employee_constant_1990usd :children_out_of_school_primary :mmr :pc_per_100 :tb_mortality_all_forms_per_year :proportion_of_the_population_using_improved_drinking_water_sources_total :proportion_of_the_population_using_improved_sanitation_facilities_total :male_agriculture_workers_percent :expenditure_per_student_tertiary_percent_of_gdp_per_capita :total_25to54_labour_to_population_percent :agricultural_land_percent_of_land_area :cross_sectors_aid_percent_of_total_aid :reported_deaths :female_long_term_unemployment_percent :tb_new_and_relapse_cases_per_100_000_population :population_aged_40to59_years_total_number :total_15plus_unemployment_percent :female_industry_workers_percent 
:mean_years_in_school_women_25_years_and_older :high_technology_exports_percent_of_manufactured_exports :total_40to59_years_percent :tb_prevalence_all_forms_per_100_000_population_per_year :oil_consumption_per_capita_tonnes_per_year :military_expenditure_percent_of_gdp :surface_area_sq_km :age_at_1st_marriage_women :hdi :body_mass_index_bmi_men_kgperm2 :rti_0to14_all_age_adj :poverty_headcount_ratio_at_rural_poverty_line_percent_of_rural_population :total_gni_ppp_current_international_usd :stunt_0to4_unicef_who :sbp_male_mm_hg_age_standardized_mean :total_sex_ratio :education_aid_percent_of_total_aid :reported_cases_per_100000 :gdp_pc_test :female_40to59_years_percent :total_agriculture_workers_percent :extreme_poverty_percent_people_below_usd125_a_day :primary_completion_rate_male_percent_of_relevant_age_group :income_share_held_by_second_20percent :population_in_urban_agglomerations_of_more_than_1_million_percent_of_total_population :total_0to4_years_percent :youth_15to24_literacy_rate_percent_total :residential_electricity_consumption_total_kwh :production_sector_aid_percent_of_total_aid :one_year_olds_immunized_with_mcv_percent :total_contributing_family_workers_percent :male_self_employed_percent :industry_value_added_percent_of_gdp :growth_next_10_years :drought_affected :expenditure_per_student_secondary_percent_of_gdp_per_capita :female_service_workers_percent :natural_gas_production_per_person_tonnes_oil_equivalent :other_social_services_aid_percent_of_total_aid :wood_removal_cubic_meters :inflation_gdp_deflator_annual_percent :reported_deaths_per_100000 :average_age_of_billionaires :total_salaried_employees_percent :aid_received_per_person_current_ususd :tc_male_mmolperl_age_standardized_mean :births_attended_by_skilled_health_staff_percent_of_total :tb_prevalence_all_forms_in_hiv_positive_per_100_000_population_per_year :prevalence_of_current_tobacco_use_among_adults_larger_or_equal15_years_percent_male :estimated_art_coverage_cd4_smaller_350 
:tb_incidence_all_forms_per_100_000_population_per_year :storm_killed :gdp_per_capita_growth_annual_percent :total_above_60_percent :tb_mortality_all_forms_per_100_000_population_per_year :total_industry_workers_percent :electricity_consumption_per_capita_kwh :government_and_civil_society_aid_percent_of_total_aid :hourly_compensation_ususd :natural_gas_proved_reserves_total_tonnes_oil_equivalent :total_reserves_percent_of_total_external_debt :general_government_expenditure_on_health_as_percentage_of_total_expenditure_on_health :adult_15plus_literacy_rate_percent_total :import_value_index_2000_equals_100 :dots_population_coverage_percent :crude_oil_production_per_capita_toe :trade_balance_current_ususd :poverty_headcount_ratio_at_usd125_a_day_ppp_percent_of_population :fixed_broadband_internet_subscribers_per_100_people :total_20to39_years_percent :male_15to19_years_percent :ratio_of_girls_to_boys_in_primary_and_secondary_education_percent :alcohol_consumption_per_adult_15plus_litres :rti_30to44_all_age_adj :suicide_30to44_all_age_adj :median_age :flood_killed :traffic_mortality_per_100000_age_adjusted :forest_products_total_usd :energy_use_per_capita_toe :whole_country_all_new_case_detection_rate_percent :tb_prevalence_all_forms_in_hiv_positive_per_year :flood_affected :oil_proved_reserves_per_person_tonnes :natural_gas_proved_reserves_per_person_tonnes_oil_equivalent :income_share_held_by_lowest_10percent :one_year_olds_immunized_with_three_doses_of_diphtheria_tetanus_toxoid_and_pertussis_dtp3_percent :total_15to24_employment_to_population_percent :sex_ratio_15to24 :sex_ratio_0to14 :male_55plus_unemployment_percent :gdp_constant_2000_ususd :primary_school_completion_percent_of_girls :homicide_15to29_all :privately_owned_forest_percent :female_contributing_family_workers_percent :female_0to4_years_percent :private_expenditure_on_health_as_percentage_of_total_expenditure_on_health :male_25to54_unemployment_percent 
:total_expenditure_on_health_as_percentage_of_gdp_gross_domestic_product :armed_forces_personnel_percent_of_labor_force_wdi_society_war_and_peace :malnutrition_prevalence_weight_for_age_percent_of_children_under_5 :food_supply_kilocalories_per_person_and_day :dots_treatment_success_percent :democracy_polityiv :tc_female_mmolperl_age_standardized_mean :total_15to24_unemployment_percent :one_year_olds_immunized_with_three_doses_of_hib_hib3_vaccine_percent :foreign_direct_investment_net_inflows_percent_of_gdp :male_20to39_years_percent :income_share_held_by_highest_20percent :subsistence_incomes_per_person :female_above_60_percent :female_10to14_years_percent :suicide_0to14_all_age_adj :urban_population_growth_annual_percent :estimated_hiv_prevalencepercent_ages_15to49 :total_gdp_pppusd_inflation_adjusted :sbp_female_mm_hg_age_standardized_mean :tb_mortality_all_forms_in_hiv_positive_per_year :male_industry_workers_percent :tb_new_and_relapse_cases :dots_all_new_case_detection_rate_percent :life_expectancy_at_birth :total_population :long_term_unemployment_rate_percent :central_bank_discount_rate_annual_percent :per_capita_total_expenditure_on_health_ppp_int_usd :epidemic_affected :total_service_workers_percent :nuclear_production_per_person_toe :total_self_employed_percent :expenditure_per_student_primary_percent_of_gdp_per_capita :under_5_mortality_rate :annual_number_of_aids_deaths :tb_incidence_all_forms_per_year :drought_killed :homicide_60plus_all_age_adj :billionaires_per_million_inhabitants :co2_emission_per_person_metric_tons :oda_received_total_constant_2010_ususd :income_share_held_by_highest_10percent :percent_solid_biofuels_in_total_energy_supply :natural_gas_production_total_tonnes_oil_equivalent :primary_forest_land_ha :rti_45to59_all_age_adj :epidemic_killed :gni_per_capita_atlas_method_current_ususd :arms_imports_constant_1990_ususd :exports_of_goods_and_services_percent_of_gdp 
:prevalence_of_current_tobacco_use_among_adults_larger_or_equal15_years_percent_both_sexes :income_per_person :total_10to14_years_percent :per_capita_government_expenditure_on_health_ppp_int_usd :poverty_headcount_ratio_at_urban_poverty_line_percent_of_urban_population :oda_percent_gni :proportion_of_the_population_using_improved_drinking_water_sources_rural :female_self_employed_percent :neonates_protected_at_birth_against_neonatal_tetanus_pab_percent :privately_owned_wooded_land_percent :gini_index :male_40to59_years_percent), :constraints {:privately_owned_forest_percent 60, :female_contributing_family_workers_percent 10}}
 at inferenceql.inference.gpm.crosscat.XCat.simulate (crosscat.cljc:95)
    inferenceql.inference.gpm.conditioned.ConditionedGPM.simulate (conditioned.cljc:12)
    inferenceql.inference.gpm$simulate.invokeStatic (gpm.cljc:119)
    inferenceql.inference.gpm$simulate.invoke (gpm.cljc:116)
    inferenceql.query.plan$fn__15560$fn__15564.invoke (plan.cljc:526)
    clojure.core$repeatedly$fn__6531.invoke (core.clj:5174)
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)
    clojure.lang.RT.seq (RT.java:535)
    clojure.core$seq__5467.invokeStatic (core.clj:139)
    clojure.core$map$fn__5935.invoke (core.clj:2763)
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)
    clojure.lang.LazySeq.withMeta (LazySeq.java:36)
    clojure.lang.LazySeq.withMeta (LazySeq.java:17)
    clojure.core$with_meta__5485.invokeStatic (core.clj:220)
    clojure.core$vary_meta.invokeStatic (core.clj:677)
    clojure.core$vary_meta.doInvoke (core.clj:677)
    clojure.lang.RestFn.invoke (RestFn.java:464)
    inferenceql.query.relation$relation.invokeStatic (relation.cljc:19)
    inferenceql.query.relation$relation.doInvoke (relation.cljc:10)
    clojure.lang.RestFn.invoke (RestFn.java:439)
    inferenceql.query.plan$fn__15560.invokeStatic (plan.cljc:527)
    inferenceql.query.plan/fn (plan.cljc:516)
    clojure.lang.MultiFn.invoke (MultiFn.java:239)
    inferenceql.query.plan$fn__15550.invokeStatic (plan.cljc:501)
    inferenceql.query.plan/fn (plan.cljc:498)
    clojure.lang.MultiFn.invoke (MultiFn.java:239)
    inferenceql.query.base$query.invokeStatic (base.cljc:24)
    inferenceql.query.base$query.invoke (base.cljc:12)
    inferenceql.query.strict$query.invokeStatic (strict.cljc:9)
    inferenceql.query.strict$query.invoke (strict.cljc:6)
    inferenceql.publish$query_handler$fn__19561.invoke (publish.clj:128)
    ring.middleware.format_params$wrap_format_params$fn__18514.invoke (format_params.clj:90)
    ring.middleware.format_response$wrap_format_response$fn__19386.invoke (format_response.clj:194)
    ring.middleware.format_response$wrap_format_response$fn__19386.invoke (format_response.clj:194)
    reitit.ring$ring_handler$fn__17536.invoke (ring.cljc:329)
    reitit.ring.middleware.exception$wrap$fn__17840$fn__17841.invoke (exception.clj:49)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.lang.AFunction$1.doInvoke (AFunction.java:31)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    ring.adapter.jetty$proxy_handler$fn__17977.invoke (jetty.clj:27)
    ring.adapter.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$ff19274a.handle (:-1)
    org.eclipse.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:127)
    org.eclipse.jetty.server.Server.handle (Server.java:516)
    org.eclipse.jetty.server.HttpChannel.lambda$handle$1 (HttpChannel.java:400)
    org.eclipse.jetty.server.HttpChannel.dispatch (HttpChannel.java:645)
    org.eclipse.jetty.server.HttpChannel.handle (HttpChannel.java:392)
    org.eclipse.jetty.server.HttpConnection.onFillable (HttpConnection.java:277)
    org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded (AbstractConnection.java:311)
    org.eclipse.jetty.io.FillInterest.fillable (FillInterest.java:105)
    org.eclipse.jetty.io.ChannelEndPoint$1.run (ChannelEndPoint.java:104)
    org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask (EatWhatYouKill.java:338)
    org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce (EatWhatYouKill.java:315)
    org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce (EatWhatYouKill.java:173)
    org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run (EatWhatYouKill.java:131)
    org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run (ReservedThreadExecutor.java:409)
    org.eclipse.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:883)
    org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run (QueuedThreadPool.java:1034)
    java.lang.Thread.run (Thread.java:829)

Schaechtle commented 1 year ago

This entails

  1. removing this exception and ensuring that when a target also appears as a constraint, logpdf is 0 if the two values agree and -inf if they differ, e.g.
    (gpm/logpdf model {:foo 1} {:foo 1})
    0

    and

    (gpm/logpdf model {:foo 1} {:foo 0})
    -inf
  2. removing this exception and returning the constrained value for every target that overlaps with a constraint on each invocation of simulate, e.g.:
    (gpm/simulate model [:foo :bar] {:bar 17})
    {:foo 42 :bar 17}
  3. adding tests for logpdf and simulate that cover these overlapping cases.
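
The simulate behaviour in item 2 could be sketched roughly as follows. This is only an illustration, not the ClojureCat implementation: `strict-simulate` is a hypothetical stand-in for a model that rejects overlapping targets and constraints (it returns 42 for every target), and `simulate-with-overlap` is a hypothetical wrapper name.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-in for a simulate that, like the one in the stack
;; trace, throws when targets and constraints share variables.
;; Toy sampler: every target gets the value 42.
(defn strict-simulate
  [targets constraints]
  (let [shared (set/intersection (set targets) (set (keys constraints)))]
    (when (seq shared)
      (throw (ex-info "Targets and constraints must be unique!"
                      {:shared shared})))
    (zipmap targets (repeat 42))))

;; Sketch of the proposed behaviour: simulate only the unconstrained
;; targets, then copy the constrained values of the overlapping targets
;; into the result.
(defn simulate-with-overlap
  [targets constraints]
  (let [overlap      (select-keys constraints targets)
        free-targets (remove (set (keys overlap)) targets)]
    (merge (strict-simulate free-targets constraints)
           overlap)))

(simulate-with-overlap [:foo :bar] {:bar 17})
;; => {:foo 42, :bar 17}
```

Under this sketch the strict function is never called with a shared variable, so the existing uniqueness check could stay in place further down if that turns out to be useful.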

Schaechtle commented 1 year ago

Addendum: if targets and constraints in logpdf only partially overlap, we add 0 or -inf for the overlapping variables as needed, e.g.

(gpm/logpdf model {:bar 17 :foo 1} {:foo 0})
   -inf

or, for the case when constraints and targets agree, we'd add 0 so that (gpm/logpdf model {:bar 17 :foo 1} {:foo 1}) returns the same result as (gpm/logpdf model {:bar 17} {})
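
A minimal sketch of the logpdf rule described above, covering both the full and the partial overlap cases. The names are hypothetical: `strict-logpdf` stands in for a model that expects disjoint targets and constraints (a toy density that assigns log 0.5 to each target and ignores the constraints), and `logpdf-with-overlap` is an illustrative wrapper, not the library's API.

```clojure
;; Hypothetical stand-in for a logpdf that requires disjoint targets and
;; constraints. Toy density: each target contributes log(0.5); the
;; constraints are ignored.
(defn strict-logpdf
  [targets constraints]
  (reduce + 0.0 (repeat (count targets) (Math/log 0.5))))

;; Sketch of the proposed behaviour: a target that is also constrained
;; contributes 0 when the two values agree and -inf when they differ.
(defn logpdf-with-overlap
  [targets constraints]
  (let [shared (filter #(contains? constraints %) (keys targets))]
    (if (every? #(= (get targets %) (get constraints %)) shared)
      ;; All overlapping values agree: drop them from the targets (adding 0)
      ;; and score only the remaining targets.
      (strict-logpdf (apply dissoc targets shared) constraints)
      ;; At least one overlapping value disagrees: the event is impossible.
      ##-Inf)))

(logpdf-with-overlap {:foo 1} {:foo 1})          ; full overlap, values agree
(logpdf-with-overlap {:foo 1} {:foo 0})          ; full overlap, values differ
(logpdf-with-overlap {:bar 17 :foo 1} {:foo 0})  ; partial overlap, values differ
(logpdf-with-overlap {:bar 17 :foo 1} {:foo 1})  ; partial overlap, values agree
```

Two caveats apply to this sketch. Clojure's `=` treats `1` and `1.0` as different, so a real implementation would have to decide how numeric values are compared. And because the toy density ignores constraints, it does not show whether the remaining targets should still be scored conditionally on the shared constraint; that is left to the real implementation.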