idaholab / blackbear

BlackBear is a MOOSE-based code for simulating degradation processes in concrete and other structural materials.
GNU Lesser General Public License v2.1

Integrate NEML2 into Blackbear #334

Closed hugary1995 closed 11 months ago

hugary1995 commented 1 year ago

This PR adds

  1. NEML2 as a submodule
  2. a neml2.mk for make rules
  3. a test userobject to perform batched update (which calls NEML2)
  4. a test material object as the glue material to retrieve the result

We still need to set up the CIVET recipe to

  1. download libtorch
  2. configure MOOSE with libtorch
  3. compile Blackbear with NEML2 and libtorch

close #333

hugary1995 commented 1 year ago

I have two open moose PRs that try to fix some issues for NEML2: https://github.com/idaholab/moose/pull/23488 https://github.com/idaholab/moose/pull/23472

Once they are merged, we will be able to make a fair comparison among

The naive linear elasticity model I included in this PR performs similarly to a MOOSE native material model.

hugary1995 commented 1 year ago

Hold on! This is not ready for review yet. I just got all the pieces together (from NEML2 and MOOSE) and am working on this PR right now.

hugary1995 commented 1 year ago

This is what happens when we set verbose = true :1st_place_medal: [image]

hugary1995 commented 1 year ago

@bwspenc @dschwen @reverendbedford This PR is ready for review. I am very excited about this new capability. The tests are passing on my local machine. The Precheck on CIVET failed due to "Host key verification failed", which I am not sure I can fix. doc/content/getting_started/NEML2.md is a good place to start the review.

@dschwen You may want to do a sanity check for all the template magic in NEML2Utils.h, there are some interesting recursive variadic templates as well.
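For anyone reviewing who is less familiar with the pattern: a recursive variadic template peels one argument off a parameter pack per instantiation until it reaches a base case. The sketch below is a generic illustration only, not code from NEML2Utils.h; `total_size` is a hypothetical name chosen for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Base case: an empty parameter pack contributes nothing.
inline std::size_t total_size() { return 0; }

// Recursive case: handle the first container, then recurse on the rest of the pack.
// (Hypothetical helper for illustration; it only requires that each argument has .size().)
template <typename First, typename... Rest>
std::size_t total_size(const First & first, const Rest &... rest)
{
  return first.size() + total_size(rest...);
}
```

Calling `total_size(v, s)` with a three-element vector and a two-character string instantiates the recursive overload twice and then falls through to the base case, giving 5.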

@lynnmunday @dewenyushu @recuero You may also be interested in reviewing this PR, as this is how I will integrate NEML2 into any MOOSE-based app. I can make a PR for isopod if you'd like. Getting the parameter gradients (in the context of inverse optimization) will be a two-line addition, as we already have that API ready in NEML2.

hugary1995 commented 1 year ago

test/tests/neml2/models contains all the NEML2 material models.

hugary1995 commented 1 year ago

We need to either modify the existing recipe for Test or add a new CIVET check recipe to test NEML2. I assume the GPU boxes are not ready yet @loganharbour

hugary1995 commented 1 year ago

I managed to get the Precheck to pass :)

All the other checks fail to build BlackBear. Because the NEML2 submodule is always initialized in the recipe, the neml2.mk rule correctly detects it and attempts to build BlackBear with NEML2. However, MOOSE isn't configured with --with-libtorch in any of the recipes, so #include <torch/torch.h> fails.
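One generic way to make this kind of failure easier to diagnose is to probe for the header before including it. The sketch below assumes a C++17 compiler (for `__has_include`). The `SKETCH_HAVE_TORCH` macro and the `torch_available()` helper are hypothetical names for illustration; they are not part of MOOSE, BlackBear, or NEML2, and this is not how neml2.mk decides whether to build with NEML2.

```cpp
// Probe for the libtorch umbrella header instead of including it unconditionally.
#if defined(__has_include)
#  if __has_include(<torch/torch.h>)
// A real build would place `#include <torch/torch.h>` here.
#    define SKETCH_HAVE_TORCH 1
#  endif
#endif

// Fall back to "not available" when the probe is unsupported or the header is missing.
#ifndef SKETCH_HAVE_TORCH
#  define SKETCH_HAVE_TORCH 0
#endif

// Hypothetical helper: lets calling code branch on, or report, the missing dependency.
constexpr bool torch_available() { return SKETCH_HAVE_TORCH == 1; }
```

With a guard like this, code that needs libtorch can emit a targeted error message rather than failing on the raw include.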

moosebuild commented 1 year ago

Job Test on 7ffea8c wanted to post the following:

View the site here

This comment will be updated on new commits.

hugary1995 commented 1 year ago

Okay, the tests are good now. Test fails to build the docs because we have not enabled NEML2 yet...

loganharbour commented 1 year ago

Let me know when we want to change the civet recipe around.

hugary1995 commented 1 year ago

I think the only recipe that needs to be modified is the Test. Basically, we need to compile BlackBear with a moose that's configured with libTorch, run the tests, build docs, and test dbg.

loganharbour commented 1 year ago

> I think the only recipe that needs to be modified is the Test. Basically, we need to compile BlackBear with a moose that's configured with libTorch, run the tests, build docs, and test dbg.

Sounds good. I will plan on this, then:

Are you on board @bwspenc?

hugary1995 commented 1 year ago

idk if we default to cleaning the container in a new recipe. "Documentation" could be run immediately after "Test" to save a recompile. But if we can start a check from where we left off, then never mind me.

loganharbour commented 1 year ago

We prioritize duplicate compiles in order to support parallel jobs. In most circumstances, we have clients available. We duplicate 4 min of compiling to save 20 min.

hugary1995 commented 1 year ago

That makes sense.

moosebuild commented 1 year ago

Job Documentation on 92e495e wanted to post the following:

View the site here

This comment will be updated on new commits.

moosebuild commented 1 year ago

Job Coverage on 92e495e wanted to post the following:

Coverage

          c59398    #334 92e495
          Total     Total     +/-       New
Rate      94.80%    94.05%    -0.75%    91.34%
Hits      1768      2069      +301      306
Misses    97        131       +34       29
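
The rates in the coverage report above follow from the hit and miss counts as hits / (hits + misses). The sketch below recomputes them from the reported numbers; `coverage_rate` and `round2` are hypothetical helpers written for this example, not part of the coverage tooling.

```cpp
#include <cmath>

// Line coverage as a percentage: hits / (hits + misses) * 100.
inline double coverage_rate(int hits, int misses)
{
  return 100.0 * hits / (hits + misses);
}

// Round to two decimals, matching the precision shown in the report.
inline double round2(double x) { return std::round(x * 100.0) / 100.0; }
```

For example, `round2(coverage_rate(1768, 97))` reproduces the 94.80% base rate, `round2(coverage_rate(2069, 131))` the 94.05% rate for this PR, and `round2(coverage_rate(306, 29))` the 91.34% rate on new lines.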

Diff coverage report

Full coverage report

This comment will be updated on new commits.

hugary1995 commented 1 year ago

> I proof read your commit! I can't comment on the code, the templating is way over my head.

Thanks :)

loganharbour commented 1 year ago

Just reviewed - nice to see the batch system in action. Recursive variadic templates are reasonable enough to follow. Glad to see coverage works on the neml components.

moosebuild commented 1 year ago

Job Precheck on 1439c23 wanted to post the following:

Your code requires style changes.

A patch was auto generated and copied here
You can directly apply the patch by running, in the top level of your repository:

curl -s https://mooseframework.inl.gov/blackbear/docs/PRs/334/clang_format/style.patch | git apply -v

Alternatively, with your repository up to date and in the top level of your repository:

git clang-format e52d1be61c228ae299e01af383908f9eea35d8e2

lynnmunday commented 1 year ago

@hugary1995 I tried to push some changes to your branch that @dschwen helped me make to get the neml2 interface to build, and in doing so I messed up the moose submodule this commit is using. I need to ask for help tomorrow to fix this.

hugary1995 commented 1 year ago

I dropped the unwanted commits and rebased on the current devel branch, assuming this is what you intended to do.

hugary1995 commented 1 year ago

The failed tests should pass once moose uses a newer libtorch. I'm somewhat reluctant to extend support for libtorch 1.10 at this moment. Alternatively, I could add a script like update_and_rebuild_neml2.sh which opts to download a newer libtorch, but that could potentially break stochastic_tools. I'm open to suggestions.

bwspenc commented 12 months ago

@hugary1995 From what I understand from talking with others, we need to build against libtorch 2.1 for this to work on Mac. It sounds like that might not be very hard for us to do. That seems unrelated to the test failures you're seeing on Linux, though.

hugary1995 commented 12 months ago

It fails to build because torch::linalg_cross doesn't exist in the current moose default libtorch, and so this build error is not specific to any operating system. I can of course work around this build error, but even if it builds there are other actual bugs that I have to work around.

If we can bump up the moose libtorch version (in the near term), there's not much value for me to fix them at this point.

dschwen commented 12 months ago

I have built moose with libtorch 2.1 (and cuda 12.1 / 12.3) (on Linux) and it seems to work just fine.

hugary1995 commented 12 months ago

> I have built moose with libtorch 2.1 (and cuda 12.1 / 12.3) (on Linux) and it seems to work just fine.

Are we ready to bump it up then? :)

loganharbour commented 12 months ago

> I have built moose with libtorch 2.1 (and cuda 12.1 / 12.3) (on Linux) and it seems to work just fine.

on sawtooth? sawtooth's glibc is too low

bwspenc commented 11 months ago

@loganharbour I was talking with @grmnptr yesterday, and he said that his understanding was that the OS update on Sawtooth that is currently underway will resolve the glibc version issue there. Is that right?

grmnptr commented 11 months ago

The containers we use for testing should already have the good GLIBC version. I am planning to submit the libtorch version bump PR today.

grmnptr commented 11 months ago

At the moment you can only use libtorch with moose on sawtooth with the containers. That will be solved soon by the sawtooth update.

grmnptr commented 11 months ago

Merged @lynnmunday 's update to moose.

hugary1995 commented 11 months ago

Fantastic! I'll get this PR going.

hugary1995 commented 11 months ago

All green now.

hugary1995 commented 11 months ago

As another future task, we need

  1. A civet recipe to download, configure, and test libtorch with cuda. Currently we only test the cpu-only libtorch.
  2. Some mechanism to detect available compute devices in the test specs. For example, some regression tests should run only when a certain type of device (e.g. cuda) is available.

loganharbour commented 11 months ago

This broke MOOSE: https://civet.inl.gov/job/1853788/

loganharbour commented 11 months ago

Looks like the sanity checking for neml2 w/o torch is poor?