edgi-govdata-archiving / ECHO_modules

ECHO_modules is a Python package for analyzing a copy of the US Environmental Protection Agency's (EPA) Enforcement and Compliance History Online (ECHO) database.
GNU General Public License v3.0

Get top violators for SDWA, GHG, TRI programs #75

Open shansen5 opened 5 months ago

shansen5 commented 5 months ago

In utilities.py, enhance get_top_violators() and chart_top_violators() to handle SDWA, GHG, TRI programs.

SDWA, TRI, and GHG do not identify violations the same way CAA, CWA, and RCRA do. Possible fields to use:

- SDWA: SDWA_COMPLIANCE_STATUS (Serious violator, Violation identified, Unknown, No Violation Identified; these can be mapped into the same results as CAA, etc.)
- TRI: TRI_RELEASES_TRANSFERS (total pounds per year released)
- GHG: GHG_CO2_RELEASES (total facility emissions in metric tons CO2e from the most recent reporting year)
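To make the proposal concrete, here is a minimal sketch of the two ranking styles these fields suggest: an ordered mapping for the categorical SDWA status, and a plain top-N by amount for the numeric TRI and GHG fields. It is not the code in utilities.py. The function names, the numeric severity ranks, the `name` key, and the sample records are all assumptions made for illustration. It uses plain lists of dicts rather than the DataFrames the module works with, only to keep the example dependency-free.

```python
import math

# The four status strings come from the issue text; the numeric ranks
# are an assumed ordering, not something defined by ECHO.
SDWA_SEVERITY = {
    "Serious violator": 3,
    "Violation identified": 2,
    "Unknown": 1,
    "No Violation Identified": 0,
}


def top_sdwa_violators(records, n=20):
    """Return up to n records with an identified violation, most severe first."""
    flagged = [
        r for r in records
        if SDWA_SEVERITY.get(r.get("SDWA_COMPLIANCE_STATUS"), 0) >= 2
    ]
    flagged.sort(
        key=lambda r: SDWA_SEVERITY[r["SDWA_COMPLIANCE_STATUS"]], reverse=True
    )
    return flagged[:n]


def top_by_amount(records, field, n=20):
    """Return up to n records with the largest numeric value in `field`."""
    reported = [
        r for r in records
        if isinstance(r.get(field), (int, float)) and not math.isnan(r[field])
    ]
    return sorted(reported, key=lambda r: r[field], reverse=True)[:n]


# Made-up sample records; "name" is a hypothetical key for readability.
facilities = [
    {"name": "A", "SDWA_COMPLIANCE_STATUS": "No Violation Identified",
     "GHG_CO2_RELEASES": 1200.0},
    {"name": "B", "SDWA_COMPLIANCE_STATUS": "Serious violator",
     "GHG_CO2_RELEASES": None},
    {"name": "C", "SDWA_COMPLIANCE_STATUS": "Violation identified",
     "GHG_CO2_RELEASES": 56000.0},
    {"name": "D", "SDWA_COMPLIANCE_STATUS": "Unknown",
     "GHG_CO2_RELEASES": 300.0},
]

print([r["name"] for r in top_sdwa_violators(facilities)])
print([r["name"] for r in top_by_amount(facilities, "GHG_CO2_RELEASES", n=2)])
```

Whether "Unknown" should count toward a violators list is a judgment call. This sketch excludes it, and facilities with no reported amount are skipped rather than treated as zero.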

ericnost commented 1 month ago

I think this is done (except SDWA). See cross-programs:

    df_violators['GHG'] = get_tri_ghg_violators(df_active, 'GHG_CO2_RELEASES', 20)
    display(chart_tri_ghg_violators(df_violators['GHG'], field='GHG_CO2_RELEASES',
            title='Greenhouse Gas Emissions in Metric Tons (most recent year)',
            xlabel='Emissions in Metric Tons'))

    df_violators['TRI'] = get_tri_ghg_violators(df_active, 'TRI_RELEASES_TRANSFERS', 20)
    display(chart_tri_ghg_violators(df_violators['TRI'], field='TRI_RELEASES_TRANSFERS',
            title='Total pounds per year released for Air Emissions, Surface Water Discharges, Underground Injections, Releases to Land and Off-Site Transfers',
            xlabel='Total pounds per year'))