rl-institut / oemof-B3

An open-source energy system model for Brandenburg/Berlin.
https://oemof-b3.readthedocs.io/
GNU Affero General Public License v3.0
9 stars 5 forks source link

Features/set name attr #281

Closed MaGering closed 1 year ago

MaGering commented 1 year ago

Resolves #137

With this PR the attribute name is checked within the script data_processing.py.

With the implemented function it is ensured that the name is

  1. empty if region is 'ALL'
  2. set (according to convention) where name is empty and region is fixed
  3. checked for all values that are not None and where region is fixed
jnnr commented 1 year ago

Hi @MaGering, I had something like this in mind (can be adapted, just a sketch and did not test it):

def get_name(region, carrier, tech):
    r"""
    This is oemof-b3's-naming convention.    
    """
    return f"{region}-{carrier}-{tech}

def get_name_for_df(df):
    # TODO: could assert that df has region carrier tech as columns.
    return df.apply(lambda x: get_name(x["region"], x["carrier"], x["tech"]), 1)

def check_name(df):
    name_generated = get_name_for_df(df)
    name_as_given = df["name"]

    diff = compare_scalar_data(name_given, name_generated)

    if diff:
        raise Warning(f"There is a diff {diff}")

def set_name(df):
    try:
        check_name_df(df)
    except Warning as w:
        raise Warning("Name will be adapted"}   

    name_generated = get_name_for_df(df)
    df.name = name_generated

    return df
MaGering commented 1 year ago

Hi @MaGering, I had something like this in mind (can be adapted, just a sketch and did not test it):

def get_name(region, carrier, tech):
    r"""
    This is oemof-b3's-naming convention.    
    """
    return f"{region}-{carrier}-{tech}

def get_name_for_df(df):
    # TODO: could assert that df has region carrier tech as columns.
    return df.apply(lambda x: get_name(x["region"], x["carrier"], x["tech"]), 1)

def check_name(df):
    name_generated = get_name_for_df(df)
    name_as_given = df["name"]

    diff = compare_scalar_data(name_given, name_generated)

    if diff:
        raise Warning(f"There is a diff {diff}")

def set_name(df):
    try:
        check_name_df(df)
    except Warning as w:
        raise Warning("Name will be adapted"}   

    name_generated = get_name_for_df(df)
    df.name = name_generated

    return df

Thanks for these suggestions. They were really helpful! Done with commit https://github.com/rl-institut/oemof-B3/pull/281/commits/595be9571ebc96b80f14a392cb66e298a7cfa033.

MaGering commented 1 year ago

Ready for your review @jnnr!

MaGering commented 1 year ago

Tested and got

* SettingWithCopyWarning

Commit https://github.com/rl-institut/oemof-B3/pull/281/commits/46a2be03594bd93480ac41e83ee99abba7894670 did not solve the problem as suggested in the console. Exactly the same warning appeared in the pandas datapackage instead of in the data_processing.py script.

I kept commit https://github.com/rl-institut/oemof-B3/pull/281/commits/46a2be03594bd93480ac41e83ee99abba7894670 anyway, as it is preferable to the previous approach of overwriting.

I could finally solve the problem using a copy of the Dataframe df. With the copy I am able to overwrite the values without the warning. Afterwards I overwrite the original df with the copy (also without a warning). See commit https://github.com/rl-institut/oemof-B3/pull/281/commits/8d083d78c702c59c4db1b6ae129577c6e6b5daf7. I tested this and compared the outcome with meld.

jnnr commented 1 year ago

Tested and got

  • A difference in additional scalars. The new functions introduced the name TOTAL-emission-constraint in three rows.

We found out that this has no influence on the results as the name is not processed for the emission constraints. I opened an issue that reminds us of that: #298