c3aidti / smoke

Gordon Group space
MIT License

Add new target for emulator inference called Cloud Liquid Water Path or CLWP #41

Closed by vasanchez16 4 months ago

vasanchez16 commented 7 months ago

Currently, the SmokePPE functions use AOD (aerosol optical depth) as the target for training the ML models. As the next step, we will add CLWP (cloud liquid water path) as a second training target.

CLWP and AOD are similar measurements in that both are 2D variables: each has one value for every latitude-longitude pair at each time stamp.
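As a rough illustration of why the two targets are interchangeable from the training side, here is a minimal sketch. The row layout, the field names (`time`, `lat`, `lon`, `aod`, `clwp`), and the numeric values are all made up for illustration; they are not the actual SmokePPE schema or data.

```javascript
// Hypothetical rows: one value per (time, lat, lon) for each 2D variable.
// The numbers are placeholders, not real simulation output.
const rows = [
  { time: "t0", lat: 10.0, lon: 20.0, aod: 0.12, clwp: 0.05 },
  { time: "t0", lat: 10.0, lon: 22.5, aod: 0.15, clwp: 0.07 },
  { time: "t1", lat: 10.0, lon: 20.0, aod: 0.11, clwp: 0.04 },
];

// Build the training target vector for whichever variable is requested.
// Switching from AOD to CLWP only changes the name passed in.
function targetVector(rows, targetName) {
  return rows.map((row) => {
    if (!(targetName in row)) {
      throw new Error(`missing target "${targetName}" at ${row.time}`);
    }
    return row[targetName];
  });
}

const aodTargets = targetVector(rows, "aod");
const clwpTargets = targetVector(rows, "clwp");
```

Because both variables share the same (time, lat, lon) index, the inputs to the model stay the same and only the target column differs.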

dadamsncsa commented 7 months ago

@vasanchez16, any thoughts on how to implement this? Does this just mean training to a different target, i.e. a new "technique"? Or does it require a different set of simulation data?

vasanchez16 commented 7 months ago

@dadamsncsa,

This will use a different output variable from the SmokePPE data we've already been working with, and train models to predict it. Currently we train models that predict AOD; these will be new models that predict cloud liquid water path (CLWP). I believe I have a decent idea of how to implement this. I will be working on it today, so we can discuss it tomorrow, or meet today if you're free.

vasanchez16 commented 7 months ago

Reminder: Necessary files need to be upserted, similar to what is done in SppeSimulatonEnsembleOutputFile.c3typ. Necessary files include:

Files need to be upserted, presumably, to SppeSimulationClwpEnsembleOutputFile.c3typ

dadamsncsa commented 7 months ago

@vasanchez16 I think these are just sppe files we previously didn't use. They are filtered out in the upsertFileTable.js method:

    var pathToFiles = containerRoot + "ens_" + String(this.simulationNumber) + "_glm_atmosphere";

Are these files part of the same SimulationModel as the other sppe vars? Also, are they part of the same ensemble, that is, do they have the same geospatial and time coverage?

If so, we can just update the upsertFileTable method to include them, then add those variables to SppeSimulationOutputParameters, then update the upsertSimulationOutput method to parse them.
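The first of those steps might look roughly like the sketch below. It is plain JavaScript, not the actual upsertFileTable code: the path prefix is built as in the snippet above, but `WANTED_VARIABLES`, `selectFiles`, and the `_<variable>_` file-name convention are assumptions made for illustration.

```javascript
// Hypothetical allow-list of output variables. The names are placeholders,
// not the real SmokePPE variable names.
const WANTED_VARIABLES = ["aod", "clwp"];

// Build the per-simulation path prefix the same way the snippet above does.
function pathPrefix(containerRoot, simulationNumber) {
  return containerRoot + "ens_" + String(simulationNumber) + "_glm_atmosphere";
}

// Keep only files under the prefix whose name mentions a wanted variable.
// Assumes (hypothetically) that the variable appears as "_<variable>_".
function selectFiles(fileNames, containerRoot, simulationNumber, wanted = WANTED_VARIABLES) {
  const prefix = pathPrefix(containerRoot, simulationNumber);
  return fileNames.filter(
    (name) =>
      name.startsWith(prefix) &&
      wanted.some((variable) => name.includes("_" + variable + "_"))
  );
}
```

With an allow-list like this, including the previously skipped files is a matter of adding the new variable name rather than writing a second upsert path.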

The result would be new fields in the SimulationOutput types. Perhaps we should also rename "sumAll" to something more descriptive of what it actually sums.

dadamsncsa commented 7 months ago

Again, assuming these are really part of the same simulations as the previously loaded _glm_atmosphere vars, we won't need a new clwp_dataset. We can use the existing tatz datasets, just with additional variables added to its SimulationOutput types.
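To make the "same dataset, extra variables" idea concrete, here is a small sketch in plain JavaScript. The record fields (`simulationNumber`, `time`, `lat`, `lon`) and the helper names are hypothetical and do not reflect the real SimulationOutput types; the point is only that a new variable can be attached to existing records when the coverage matches.

```javascript
// Hypothetical composite key: one record per simulation, time, and grid cell.
function keyOf(record) {
  return [record.simulationNumber, record.time, record.lat, record.lon].join("|");
}

// Attach a new variable to existing output records instead of building a
// separate dataset. Throws if the coverage does not line up.
function addVariable(existing, incoming, fieldName) {
  const byKey = new Map(incoming.map((r) => [keyOf(r), r[fieldName]]));
  return existing.map((record) => {
    const key = keyOf(record);
    if (!byKey.has(key)) {
      throw new Error("coverage mismatch at " + key);
    }
    return { ...record, [fieldName]: byKey.get(key) };
  });
}
```

The explicit mismatch error matters here: the whole approach rests on the new files sharing the ensemble's spatial and time coverage, so a missing key should fail loudly rather than leave gaps.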

vasanchez16 commented 7 months ago

@dadamsncsa

Yes, these are part of the same ensemble and have the same spatial and time coverage, so you are most likely right that we can implement it the way you are suggesting. I'll look into this prior to our meeting today. Thanks!

vasanchez16 commented 4 months ago

Data is now assimilated and uploaded for the full grid, and, for now, is easy to upsert for coarse grids as well.