NOAA-OWP / lstm

Other
12 stars 16 forks source link

Compatibility with heat model example from CSDMS #44

Open jmframe opened 6 months ago

jmframe commented 6 months ago

Short description explaining the high-level reason for the new issue.

Current behavior

The current implementation of the LSTM_BMI does not strictly follow the template from the CSDMS heat model example: https://github.com/csdms/bmi-example-python/tree/master/heat. Or some of the other BMI python models. For instance, in the Snow model example by @SnowHydrology, the model input values attributes of the model itself: https://github.com/SnowHydrology/snowBMI/blob/0886ac4140c4ede8300f92e42b039267ef2eaa40/snow/snow.py#L136, whereas in the LSTM model, they model inputs are passed in as arguments to the model forward function: https://github.com/NOAA-OWP/lstm/blob/bd0ccf3451ebd335c5b75e53442d9df9057d67c0/lstm/nextgen_cuda_lstm.py#L19.

Expected behavior

Is it expected that the LSTM model conform the the examples by the CSDMS? This will require some significant modification to the LSTM model implementation itself. There was, at some point, an expectation that BMI not require modifications to the specific model implementation, and that the BMI wrapper could be developed in a manner to require minimally, or none at all, changes to the model itself.

Steps to replicate behavior (include URLs)

  1. To conform to the Snow BMI example, the LSTM model would add in an _input_layer as an attribute to the LSTM class, such as this: self._input_layer = torch.tensor([0,0,...,0]), then, during the initialization of the LSTM_BMI, and during the set_value function, there would be an in-place operation to over-ride the input layer. This would allow the reference: val = self.get_value_ptr(var_name), which could then allow for the in-place operation through val[:] = src.reshape(val.shape).

Questions for OWP

  1. Is there a list of requirements for the BMI implementations of model formulations?
  2. Specifically, is it required that BMI implementations follow the templates from the examples set by CSDMS?
  3. Is there a hard requirement of code parsimony?
  4. Is there a continuing philosophy of domain experts recognizing the original model code?
  5. Are model formulation codes required to conform to the BMI templates, or is there room to allow BMI to conform to model formulation codes, provided that the BMI functions perform as intended?
hellkite500 commented 6 months ago
  1. Minimal requirements -- must implement the BMI functions specified by the Basic Model Interface. Though not useful, a compliant BMI formulation could simply return BMI_FAILURE and/or throw exceptions for each function. These won't actually work in the modeling framework, but they are valid, compliant codes that we just wouldn't use... 1.1 Additional requirements may exist for extended functionality, e.g. "calibration" and "parameter" setting, dynamic grid creation, ect, but these are optional semantics.

  2. It is NOT required, not should it be. The CSDMS examples are just that -- examples for implementing BMI based on models that BMI is interacting with that have a particular form/function that happen to fit that template well. There is not "one way" to implement an interface, which is why we treat it as an interface!

  3. Parsimony isn't a hard requirement, but is a design consideration that should be taken seriously for maintainability and readability of code being developed. If a problem can be solved simpler with the same results and not a a significant change in resource requirements (time to run, memory used, ect...), then the Occam's Razor principle should generally apply.

  4. Recognizing? Sure, well developed (and documented!!!) code should be familiar to an author who wrote some code initially. I don't think this absolves a model code from needing to consider refactoring and implementation details which generally make the code better quality, more maintainable, flexible for multiple use cases, ect. Personally, I think the montra of "don't touch domain code" is slowly fading to one of "work with domain code to make it better".

  5. BMI does not prescribe any implementation detail! Examples and templates are just convenience features which may or may not work for a given use case. BMI doesn't conform to anything, its a pretty static definition of functions, inputs, and returns. The BMI implementation details most definitely should conform to model formulations and the code that defines the formulations, and ideally, formulation codes can (and should, in some cases) be updated to conform to the functional requirements of BMI, such as being able to run a single time step update at a time vs an entire simulation duration at once, as well as being able to have input variables set by an external actor (the BMI implementation of set_value) and other such semantic needs of BMI.

jmframe commented 6 months ago

These answers read to me as somewhat contradictory to your answer to #43 https://github.com/NOAA-OWP/lstm/issues/43#issuecomment-2057872305. As far as I can tell, 1) the BMI functions were implemented, but 2) did not follow the CSDMS example close enough, so were modified to follow CDSMS more closely at the sake of working properly, and 3) the too many lines of code.

I understand that we want these functions to work efficiently, but commenting out working code and replacing it with non-functional sample code indicates to me that there are some specific guidelines that must be followed for OWP to accept the BMI implementation. Because what you've described here was met by previous versions of the LSTM BMI, but rejected.

hellkite500 commented 6 months ago

To be clear, I wasn't advocating that the existing implementation was good and/or correct. In fact it is demonstrably incorrect, but it satisfies minimum requirements.

What these answers should illustrate alongside that comment is that there is not a "one size fits all" solution and that adapting existing code to work with BMI, especially code that is harder to modify for whatever reason, means that the BMI implementation must be tailored to model being interfaced with implementation details customized to that solution.

That said, I do think there are ways to simplify BMI implementations, but these are not BMI specific nor requirements, they are simply developer tools. Being flexible and writing code that can adapt and/or take advantage of these things as they become available should be a consideration of BMI implementations.

I would also argue that these kinds of regressions are exactly what unit testing should help prevent, and will make migrations and maintenance of formulation BMI implementations a better overall experience.

SnowHydrology commented 6 months ago

Pretty much what @hellkite500 said. It looks like the updated code for set_value was a mistake that likely assumed it would work as the heat example does. There is no need to follow the CSDMS examples to the letter but my preference is to see BMI code that is relatively clean and parsimonious. If there are many lines of model-specific code in a BMI implementation, that tells me that something can be done differently (e.g., the use of modular helper functions instead).

For example, contrast this initialize code in CFE to the same function in Noah-OM.

Neither implementation is strictly wrong or correct and both are BMI compliant. However, the former performs model-specific tasks that I feel (i.e., my opinion) are better suited to the non-BMI source code or added as a helper function in bmi_cfe.c. I think (again, my opinion) that the cleaner the BMI, the more likely it is to be reusable across models and the more likely it is to be understood by domain scientist like me.

jmframe commented 6 months ago

@hellkite500 I just want to point out that the nonfunctional BMI code passes all unit tests, including the set_value unit test. I think that developing unit test that check for functionality as well as implementation is probably a good idea.

    try:
        this_set_value = -99.0
        bmi.set_value(var_name, np.array([this_set_value]))
        print ("  set value: " + str(this_set_value))

        if var_name_counter == 0: 
            pass_count += 1
    except:
        bmi_except('set_value()')

Result:

CONTROL FUNCTIONS
*****************
 updating...        timestep: 1.0
 updating until...  timestep: 100.0
 finalizing...

 Total BMI function PASS: 28
 Total BMI function FAIL: 0
madMatchstick commented 6 months ago

I think that developing unit test that check for functionality as well as implementation is probably a good idea.

Agreed - the "bmi unit test" was a first attempt to check for general compliance, across the board, for all BMI implemented models. Know that the python one you reference here was somewhat based off csdms heat examples test/.

2) Specifically, is it required that BMI implementations follow the templates from the examples set by CSDMS?

I can't help but tag a somewhat related CFE issue discussion regarding question 2 & beyond...

madMatchstick commented 6 months ago

@jmframe - Can you expand on "nonfunctional" when you say

nonfunctional BMI code passes all unit tests, including the set_value unit test

I am just diving back into this for the first time in a long minute. Thanks!