Open zaneselvans opened 4 years ago
Hey @ezwelty did this end up getting integrated into the big PR #806?
Yes, if you mean all combinations of several columns, where all those columns are present. See "Harvest process" section of my top post in #806.
Resource.harvest_dfs() harvests from multiple named input dataframes. Only dataframes with all the primary key fields are included. If Resource.harvest.harvest=True, all such dataframes are harvested and, by default, also aggregated.
Note the "Only dataframes with all the primary key fields are included." So in your example, if you wanted to include combinations of report_date and balancing_authority_id_eia from a table without utility_id_eia and state, that would not work. Adding that functionality would be simple enough, by making Resource.format_df() insert empty columns for the missing primary key columns in the case of a partial match.
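To make the partial-match idea concrete, here is a rough pandas sketch of padding a table with empty columns for the primary key fields it lacks. This is not the actual Resource.format_df() code: pad_missing_keys is a hypothetical helper, and the ID values are made up for illustration.

```python
import pandas as pd


def pad_missing_keys(df: pd.DataFrame, key_cols: list[str]) -> pd.DataFrame:
    """Hypothetical helper (not part of PUDL): add empty columns for missing keys."""
    present = [col for col in key_cols if col in df.columns]
    if not present:
        # No overlap with the primary key at all, so this is not a partial match.
        return df
    out = df.copy()
    for col in key_cols:
        if col not in out.columns:
            # Fill the missing primary key column with nulls.
            out[col] = pd.NA
    return out


keys = ["report_date", "balancing_authority_id_eia", "utility_id_eia", "state"]

# A table that only carries two of the four primary key columns (made-up values).
partial = pd.DataFrame(
    {
        "report_date": ["2019-01-01", "2020-01-01"],
        "balancing_authority_id_eia": [101, 102],
    }
)

padded = pad_missing_keys(partial, keys)
print(padded)
```

Whether null key values should then survive aggregation is a separate design question that this sketch does not try to answer.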
In addition to being able to harvest a single value that's associated with a given entity permanently, or on a per-year basis, we also need to be able to harvest association tables -- all the observed combinations of several columns (e.g. report_date, balancing_authority_id_eia, utility_id_eia, and state).
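For illustration, harvesting an association table could look roughly like the pandas sketch below: keep only the input tables that contain every key column, stack their key columns, and drop duplicate rows. The function harvest_combinations, the table names, and all values are hypothetical, and this is not how the PUDL harvesting code is actually written.

```python
import pandas as pd


def harvest_combinations(
    dfs: dict[str, pd.DataFrame], key_cols: list[str]
) -> pd.DataFrame:
    """Hypothetical sketch: collect all observed combinations of key_cols."""
    # Only tables that contain every key column contribute rows.
    frames = [
        df[key_cols] for df in dfs.values() if set(key_cols) <= set(df.columns)
    ]
    if not frames:
        return pd.DataFrame(columns=key_cols)
    return (
        pd.concat(frames, ignore_index=True)
        .drop_duplicates()
        .sort_values(key_cols)
        .reset_index(drop=True)
    )


keys = ["report_date", "balancing_authority_id_eia", "utility_id_eia", "state"]

# Made-up input tables: "a" and "b" have all key columns, "c" lacks two of them.
dfs = {
    "a": pd.DataFrame(
        {
            "report_date": ["2019-01-01", "2019-01-01"],
            "balancing_authority_id_eia": [1, 1],
            "utility_id_eia": [10, 11],
            "state": ["CO", "CO"],
        }
    ),
    "b": pd.DataFrame(
        {
            "report_date": ["2019-01-01", "2020-01-01"],
            "balancing_authority_id_eia": [1, 2],
            "utility_id_eia": [10, 20],
            "state": ["CO", "WY"],
            "net_generation_mwh": [5.0, 7.5],
        }
    ),
    "c": pd.DataFrame(
        {
            "report_date": ["2021-01-01"],
            "balancing_authority_id_eia": [3],
        }
    ),
}

assoc = harvest_combinations(dfs, keys)
print(assoc)
```

Table "c" contributes nothing here because it is missing utility_id_eia and state, which mirrors the "all the primary key fields" restriction discussed in the thread.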