OFDataCommittee / mlfoam

Thoughts about ML committee for OpenFOAM
GNU General Public License v3.0

Wider scope on data-driven techniques #1

Closed: hmarschall closed this issue 2 years ago

hmarschall commented 2 years ago

The current focus seems to be mostly on machine learning. However, data-driven techniques encompass a much broader portfolio. To cover it, maybe we can add a few of them and look out for committee members with expertise in these fields.

I am in particular thinking of data assimilation (see for instance OpenDA) or OpenFOAM-based reduced-order modeling, where big data and data-driven approaches play a crucial role.

AndreWeiner commented 2 years ago

Hi Holger, thanks for the suggestions. As far as I understand it based on the links you provided, data assimilation is basically data-driven time series forecasting. ITHACA employs classical POD for reduced-order modeling, so I would say both libraries fall into the more general ML category. I agree that the current technical roadmap is somewhat biased towards PyTorch and Deep Learning.

I would suggest that we could add an additional point to the goals section similar to:

What do you think? Best, Andre

hmarschall commented 2 years ago

Hi Andre,

it is completely OK to have the roadmap biased towards the direction in which the active committee members are oriented. However, PyTorch is also a third-party library. In contrast, ITHACA-FV, for instance, is based on OpenFOAM.

May I hence suggest that we list the different data-driven approaches (machine learning, data assimilation, ...) as the scope? On this basis, I think we should try to get users and developers to engage with the committee, and so broaden the committee's foci naturally as expertise in the different areas grows.

Best regards, Holger

tmaric commented 2 years ago

Hi Holger and Andre,

I disagree, and here's why. The fact that we are dealing with third-party libraries doesn't change the question of whether the items in the roadmap are actively being worked on or not. Of course the libraries are third-party; we're not going to implement our own TensorFlow or PyTorch in OpenFOAM. If there isn't an actual person volunteering to do the actual work on these different data-driven approaches, then we are bloating the roadmap with fake goals, aren't we? I'm thinking the other way around: advertise the committee, show what we are doing in OF Workshop trainings, and invite people who are willing to put in work hours to add new features to the roadmap, so it grows organically. I don't see the benefit of a goal in the roadmap that nobody is working on.

Best regards, Tomislav

hmarschall commented 2 years ago

Hi,

well, it's not about bloating any roadmap. It's about appreciating the fact that data-driven means more techniques than machine learning. This should be reflected in the committee goals/scope, or wherever you want to state this. In addition to, and independently of, this, there is a roadmap.

Of course, everything relied on here is third-party software, and this is precisely why it would be inconsistent to say: "This is about machine learning (where we use PyTorch) and we also promote third-party activities connected to OpenFOAM like OpenDA".

As I said, the fact that data-driven is not only ML (with PyTorch) should be reflected somewhere. Otherwise, this committee is about ML/DL only, and then it should be named accordingly for the sake of clarity.

Best, Holger

tmaric commented 2 years ago

Hi Holger,

Sure, separating the roadmap, as the list of issues currently being worked on, from something like goals/scope works.

Best regards, Tomislav

AndreWeiner commented 2 years ago

Hi everybody, as suggested, I broadened the scope of the SIG and used some more generic terminology. Let me know what you think. Best, Andre

tmaric commented 2 years ago

Looks great :+1:

hmarschall commented 2 years ago

Just one note: I think formally this is a Technical Committee (TC).

AndreWeiner commented 2 years ago

We proposed the idea as a TC, but in the last joint TC meeting it was listed as a SIG, so I stuck to that term. Anyhow, I think it's not all that important :-)

Can this issue be closed for now?