trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/
Other
1.21k stars 564 forks source link

Remove duplication and improve reuse of MiniTensor code in Trilinos #609

Closed bartlettroscoe closed 7 years ago

bartlettroscoe commented 8 years ago

CC: @amota, @trilinos/intrepid2

Description:

When Intrepid2 was created, it duplicated the MiniTensor code and created a maintenance problem.

This Story is to examine ways to eliminate the duplication of the MiniTensor code and improve its reuse in Trilinos and other projects.

Tasks:

1) Get copyright for MiniTensor [Done] 2) Create a new TriBITS Trilinos PT package called "MiniTensor" under packages/minitensor/ and get it working (but no other Trilinos packages use this yet) (see below) [Done] 3) Convert ROL to use new MiniTensor package [Done] 4) Convert Albany to use new MiniTensor package [Done] 5) Eliminate compiler warnings from MiniTensor package (needed for Sierra) [Done] 6) Remove version of MiniTensor in Intrepid2 package [Done] 7) Get Sierra to use new MiniTensor package [Done] 8) Remove version of MiniTensor in Intrepid [Done]

bartlettroscoe commented 8 years ago

Below is an email thread that I had with @amota about options for dealing with MiniTensor.

The next step will be to call a short meeting of interested Intrepid and Intrepid2 developers and stakeholders for decide how to deal with the duplication of MiniTensor (and perhaps other duplicated code?).

@amota, given that this is your code and you know these developers and stakeholders the best, could you call this meeting and add me to the attendee's list?


From: Mota, Alejandro Sent: Monday, September 12, 2016 5:24 PM To: Bartlett, Roscoe A Subject: Re: MiniTensor

Hi Ross,

Given that MiniTensor cannot go into Teuchos because it depends on Sacado, it seems that the most attractive solution would be to create an IntrepidMiniTensor subpackage to avoid code duplication, as you say.

I agree then that the intrepid/intrepid2 developers should be involved in the discussion, along with some others that care about this like Brent Perschbacher, Dave Littlewood and Jake Ostien.

Yes, feel free to open the issue on the Trilinos GitHub site.

Thanks,

Alejandro


From: Bartlett, Roscoe A Sent: Monday, September 12, 2016 5:09 PM To: Mota, Alejandro Subject: RE: MiniTensor

Alejandro,

Having MiniTenosr depending on Sacado would prevent it from being added to Teuchos since Sacado depends on Teuchos and we can't have circular dependencies. Could MiniTensor go into Scacdo? We would have to discuss that with the owner Eric Phipps. But I think that AD and tensors are different enough I would say that Sacado is would not be the right place to put MiniTensor. From a conceptual standpoint, MiniTensor is a subset of the scope for Intrepid and Intrepid2 so that seems like the right place if you are going to put it into an existing Trilinos package.

That takes us back to the Intrepid subpackage solution. I just realized a way to put MiniTensor under packages/intrepid/ without having to break Intrepid into subpackages and not creating what appears to most as a new top-level Trilinos package.

The idea is that we could just create a new top-level TriBITS package for MiniTensor called "IntrepidMiniTensor" located under the 'intrepid' source directory like:

  Trilinos/
    packages/
      intrepid/
        minitensor/
          src/
          test/

We would then list the new IntrepidMiniTensor library in the Trilinos/PackagesList.cmake file as:

    ...
    IntrepidMiniTensor  ...
    Intrepid ...
    Intrepid2 ...
    ...

and then the Intrepid, Intrepid2, and ROL Dependencies.cmake files would list "IntrepidMiniTensor" as an upstream package.

That would have all the correct behaviors w.r.t. to the checkin-test.py and CI testing processes where a change to any file under the directory Trilinos/packages/intrepid/minitensor/ would trigger the enable of the IntrepidMiniTensor Package as well as the Intrepid package (and all of their downstream dependent packages including Intrepid2, ROL, etc.). Having MiniTensor under the intrepid/ directory would make it look like we were not created a new top-level package.

But this is technically still created a new top-level Trilinos package and having "IntrepidMiniTensor" not be a real subpackage under "Intrepid" will likely confuse some people (but without harm I suspect).

To avoid those problems, we come back to the idea of officially breaking Intrepid into the subpackages "MiniTensor" and "Core" and do as we discussed over the phone. Or, more logically, combine Intrepid and Intrepid2 into a single top-level package "Intrepid" with the subpackages "IntrepidMiniTensor", "IntrepidClassic" (the old "Intrepid") and "Intrepid2" (i.e. subpackage "2" under "Intrepid").

However, the best solution from a clarity and dependency-management standpoint would be to make MiniTensor its own top-level package but that would require broader approval and DOE copyright assertion. However, the "IntrepidMiniTensor" top-level package solution I describe above requires the most minimal changes so is attractive.

[...]

It might be good to get the Intrepid and Intrepid2 developers and stakeholders together for a meeting for how to address this duplication.

Do mind if we create a Trilinos GitHub issue for "Remove duplication and improve reuse of MiniTensor code in Trilinos" and then track this conversation on their?

Cheers,

-Ross


From: Mota, Alejandro Sent: Monday, September 12, 2016 4:13 PM To: Bartlett, Roscoe A Subject: MiniTensor

Hi Ross,

Thanks for the info you provided at the telecon.

I just remembered that MiniTensor also depends on Sacado for automatic differentiation.

Not sure if that would prevent it from moving into Teuchos.

Alejandro

rppawlo commented 8 years ago

Alejandro,

Please add me to the meeting too.

Roger

bartlettroscoe commented 8 years ago

The decision was to try to make MiniTensor its own Trilinos/TriBITS package. See below email just sent out.

-----Original Message----- From: Mota, Alejandro Sent: Thursday, September 15, 2016 4:43 PM To: trilinos-developers@software.sandia.gov Cc: trilinos-framework@software.sandia.gov; Bartlett, Roscoe A; Perschbacher, Brent M; Perego, Mauro (-EXP); Willenbring, James M; Littlewood, David John; Ostien, Jakob T Subject: MiniTensor: request to make it a separate package

Hello all,

More than three and a half years have passed since I submitted a request for inclusion of MiniTensor in Trilinos, as shown below. In the end MiniTensor was included as part of Intrepid, and it is now part of both Intrepid and Intrepid2.

In that time MiniTensor and its user base have grown. Today I had a videoconference with Ross, Brent, Mauro, Dave Littlewood & Jake Ostien about the future of MinTensor. For the reasons described below, the consensus was that it would be a good idea to make MiniTensor into its own separate package. So I’m formally requesting this.

  • Proposed package name:

MiniTensor. This name is descriptive of the code’s purpose: manipulation and operations of small vectors and tensors.

  • Purpose of package:

Algebra and calculus of small vectors and tensors, most of the time of dimensions 2 and 3, sometimes up to 20 or 30. Common transformations, factorizations and geometric operations used in computational mechanics. Material modeling in solid mechanics. Solution and optimization of small linear and nonlinear systems for use in modeling of material models in solid mechanics. Optimized for ease of use (i.e. implementation of material models follows closely their description in papers/textbooks) and small tensors and vectors. Provide ability to run material models in GPU/many core architectures.

  • Whether to be included in an existing Trilinos package:

MiniTensor’s functionality does not exist in other packages, but it was originally included as part of Intrepid. This was done to avoid creating a new package. With the creation of Intrepid2, however, there are now two versions of MiniTensor that are essentially the same. This is unnecessary code duplication. Thus MiniTensor is now part of Intrepid/Intrepid2, but it does not depend on them. Likewise, Intrepid/Intrepid2 do not depend on MiniTensor. They are independent from one another.

The main codes that use MiniTensor are Albany and Sierra/Solid Mechanics. One of the original reasons to move MiniTensor from Albany to Trilinos was to make it available for use in both Albany and Sierra/SM. When Intrepid2 was created, Albany started using it right away; but Sierra still uses Intrepid. Thus at the moment it is necessary to keep two versions of MiniTensor that have the same functionality.

Although currently MiniTensor is small (~600K), we think that it will grow in the near future, in particular as we have started interfacing it with ROL to have access to more sophisticated optimization algorithms for use in very complex material models.

  • Required and optional Trilinos packages:

Required: Teuchos, Sacado, Kokkos. Optional: none.

  • Required and optional TPLs:

Required: boost. Optional: none.

  • Stable or experimental code.

Stable as it is used in both research (Albany) and production (Sierra/SM) codes.

  • Subpackages:

None.

  • Development support and maintenance plan:
    • Package maintainer: Alejandro Mota, primary maintainer.
    • Release: The code is already released as part of Intrepid/Intrepid2.
    • External release, long-term maintainer: Alejandro Mota. Our solid mechanics research relies on using this library, thus as long as this research is supported in either Albany or Sierra, we will support the library.

Thanks,

Alejandro


On 2013/01/09, 18:46 , "Alejandro Mota" amota@sandia.gov wrote:

Hello all,

We propose a new Trilinos package as follows, according to the procedure
outlined in

https://software.sandia.gov/trilinos/developer/policies/adding_new_package.h tml

- Proposed package name:

MikroTanustes, or in the original Greek μικρό τανυστές, which means
small tensors in English.

- Purpose of package:

Algebra and calculus of small vectors and tensors, most of the time of
dimensions 2 and 3, and definitely less than dimension 8. Common
transformations, factorizations and geometric operations used in
computational mechanics. Material modeling in solid mechanics. Optimized
for ease of use and small tensors and vectors (á la TVMet or Blitz++
small matrices).

- Whether to be included in an existing Trilinos package:

We don't see this functionality somewhere else in Trilinos. We have been
made aware of the efforts by Carter Edwards and Kokkos Arrays. There may
be some overlap there but our aim is tiny tensors and vectors that are
used for local operations.

- Required and optional Trilinos packages:

We only require Sacado for automatic differentiation. No other
requirements or options.

- Required and optional TPLs:

None other than the standard C++ libraries.

- Stable or Experimental code:

We consider it stable, and one reason we would like to see it in
Trilinos is for use in Sierra.

- Subpackages:

None

- Development support and maintenance plan:

   * Package maintainer: myself (Alejandro Mota), primary maintainer.

   * Release: code is ready for release. It is right now part of Albany
(https://software.sandia.gov/albany/) which uses Trilinos extensively.
Incidentally, the library has gone through copyright review and been
released under a BSD-like license as part of Albany.

   * External release, long-term maintainer: Alejandro Mota. Our
research relies on using this library, so as long as this research is
supported, we will support the library.

Thanks,

Alejandro
lxmota commented 8 years ago

Today there was a telecon about MiniTensor. In attendance Ross Bartlett, Mauro Perego, Mike Heroux, Dave Littlewood, Brent Perschbacher, and I. Here are the conclusions from that meeting:

So I'm closing this issue and then there will be a new one once the copyright moves forward, which will take a while.

bartlettroscoe commented 7 years ago

Putting back in progress since this has the right scope and background already.

bartlettroscoe commented 7 years ago

@lxmota,

As we discussed, please:

Then I can pull the branch and give it a try and help fix issues.

lxmota commented 7 years ago

@bartlettroscoe pushed initial sources to minitensor-pkg-609 branch.

bartlettroscoe commented 7 years ago

pushed initial sources to minitensor-pkg-609 branch.

@lxmota, thanks. I will have a look and get back to you.

bartlettroscoe commented 7 years ago

@lxmota, I pulled the branch minitensor-pkg-609 and it seems like there is some confusion. You are only trying to create a new TriBITS packages under the Trilinos TriBITS Repo as described here:

If you look at:

you will see that a TriBITS package does not have a PackagesList.cmake, TPLsList.cmake, or ProjectName.cmake file. It is only if you are building MiniTensor as its own TriBITS CMake project that you would define those things. But since MiniTensor has required dependencies on several Trilinos packages, that is not really a viable approach for any use case (at least not until we can do TriBITSPub/TriBITS#63).

I will go ahead and try to fix this up and if I run into problems, I will contact you. At the very least, I will push commits to the minitensor-pkg-609 branch branch that will have all of the basic TriBITS infrastructure issues resolved.

bartlettroscoe commented 7 years ago

@lxmota,

If fixed up MiniTensor on the minitensor-pkg-609 branch to build as its own PT package under Trilinos. I pushed the commits to the branch on this repo. I tested this locally on my machine crf450 with the checkin-test-sems.sh script with:

$ cd CHECKIN/
$  ./checkin-test-sems.sh --enable-packages=MiniTensor --local-do-all

and this passed showing:

  Configure: Passed (0.14 min)
  Build: Passed (0.39 min)
  Test: Passed (0.00 min)

  100% tests passed, 0 tests failed out of 2

  Label Time Summary:
  MiniTensor    =   0.12 sec (2 tests)

  Total Test time (real) =   0.07 sec

A couple of comments/suggestions:

In order to create a linkage with this GitHub issue #609, you have to put #609 with the hash # in the commit log or it will not be linked in GitHub. If you look above in this issue, you can see how my new commits with #609 are linked. You can fix your older existing commits by doing this:

$ cd Trilinos/
$ git fetch origin
$ git checkout --track origin/minitensor-pkg-609
$ git rebase -i origin/develop

Then you can "edit" the your earlier commits on the branch to change, for example:

  609: Update comments to remove references to Intrepid

to

  Update comments to remove references to Intrepid (#609)

(if you are not comfortable with that, I can do this for you.)

Another thing I noticed is that MiniTensor generates a lot of warnings (even with GCC 4.7.2) with three unique warnings:

[rabartl@crf450 MPI_RELEASE_DEBUG_SHARED_PT (master)]$ grep warning: make.out | sort | uniq
/home/rabartl/Trilinos.base/Trilinos/packages/minitensor/src/MiniTensor_LinearAlgebra.t.h:912:3: warning: unused variable ‘cosine’ [-Wunused-variable]
/home/rabartl/Trilinos.base/Trilinos/packages/minitensor/src/MiniTensor_Storage.h:266:22: warning: array subscript is above array bounds [-Warray-bounds]
/home/rabartl/Trilinos.base/Trilinos/packages/minitensor/src/MiniTensor_Storage.h:273:22: warning: array subscript is above array bounds [-Warray-bounds]

I would be good to get these cleaned up since they clutter up the build output. Also, I think that Sierra is still using -Werror to elevate warnings to errors so it would be a good idea to get these cleaned up sooner rather than later if you want Sierra to use this version of MiniTensor.

Other than that, I would say this is ready to push to the 'develop' branch. To get this onto your local 'develop' branch on your local machine, just do:

$ cd Trilinos/
$ git checkout develop
$ git merge minitensor-pkg-609
$ git pull --rebase

That will give a nice clean linear history (no need for this to be on a topic branch I don't think).

Then, if you don't mind, could you try following:

and pushing this from my machine crf450 to be extra safe to avoid breaking the new CI build?

Then after that pushes, when you pull the updated 'develop' branch on your local machine, just remember to use git pull --rebase (see here) to keep a nice linear history and avoid duplicate commits.

Please let me know if you have any questions about any of this or any of the commits I made.

Thanks,

-Ross

lxmota commented 7 years ago

@bartlettroscoe

I'll fix the warnings. The complaint about the array bounds is a compiler bug, as I have dealt with it before and found it on gcc's bugzilla. I need to look for a workaround to eliminate the warning.

I'll go ahead and push from your machine.

lxmota commented 7 years ago

@bartlettroscoe I merged the minitensor-pkg-609 branch into develop and tested and pushed from crf450.

I wanted to start to fix the warnings, but when I compile in my local machine using checkin-test.py, it fails to configure because it can't find ParMETIS, which is not required by MiniTensor (or TeuchosCore, KokkosCore or Sacado, I think).

Not sure why ParMETIS is required, or how to disable it. This doesn't happen on crf450 (maybe because ParMETIS is available there?).

bartlettroscoe commented 7 years ago

I merged the minitensor-pkg-609 branch into develop and tested and pushed from crf450.

Confirmed. MiniTenor is showing up as its own Trilinos package in the CI build:

I wanted to start to fix the warnings, but when I compile in my local machine using checkin-test.py, it fails to configure because it can't find ParMETIS, which is not required by MiniTensor (or TeuchosCore, KokkosCore or Sacado, I think).

ParMETIS is one of the TPLs that must be present for the new Trilinos checkin-test.py default builds. That is what #410 and #482 are about. Even if fully testing MiniTensor currently does not require ParMETIS, it will once it gets used in ROL because ROL indirectly enables Zoltan which optionally depends on ParMETIS. See:

which shows:

...
-- Setting Trilinos_ENABLE_MueLu=ON because ROL has an optional dependence on MueLu
...
-- Setting Trilinos_ENABLE_Zoltan=ON because MueLu has an optional dependence on Zoltan
...
-- Setting Zoltan_ENABLE_ParMETIS=ON since TPL_ENABLE_ParMETIS=ON
...

You have to explicitly enable optional TPLs or they will not get enabled. That is why ParMETIS and the other new set of Primary Tested (PT) TPLs get default enabled for the new PT CI build (see #410).

Not sure why ParMETIS is required, or how to disable it. This doesn't happen on crf450 (maybe because ParMETIS is available there?).

You can't disable ParMETIS for the --default-builds. The fact that you configure fails says that your env can't sufficiently test Trilinos for pushes. But on a machine with the SEMS env properly loaded, it is found. Therefore, it is good that the configure failed.

However, if you still want to use the chekcin-test.py script locally for your own testing and development work, you can define any --extra-builds you want and turn off the --default-builds with the checkin-test.py script. For example, you can write a checkin-test.py wrapper script like:

$ cat checkin-test-xlmota.sh
#!/bin/bash

echo "
-DTPL_ENABLE_MPI=ON
-DCMAKE_BUILD_TYPE=RELEASE
-DTrilinos_ENABLE_DEBUG=ON
-DBUILD_SHARED_LIBS=ON
-DTrilinos_ENABLE_COMPLEX=OFF
" > MPI_RELEASE_DEBUG_SHARED_ST.config

$TRILINOS_DIR/checkin-test.py \
  --default-builds= \
  --extra-builds=MPI_RELEASE_DEBUG_SHARED_ST \
  "$@"

This way, you can skip the --default-builds that require ParMETIS (or any other TPL) and define any builds you want to enable or disable anything you want. For local testing you can do:

./checkin-test-xlmota.sh --enable-packages=MiniTensor --local-do-all

But don't ever use this type of script to push to Trilinos. Only use the checkin-test-sems.sh script on a RHEL 6 machine with the SEMS env to test and push to Trilinos. Otherwise, you could break the standard PT CI build and that is bad.

Make sense? Would it help to start writing an FAQ for the checkin-test.py and checkin-test-sems.sh scripts for Trilinos to answers questions like these?

lxmota commented 7 years ago

Ok, thanks for the explanation.

Perhaps the easiest solution is to compile and install ParMETIS in our own local machines, so I can also test locally. Still, I promise that I'll only push from crf450.

And yes, I believe a FAQ for questions like these would be helpful, as I'm sure there will be other libraries that are needed that would seem like unexpected requirements for some new Trilinos developers.

lxmota commented 7 years ago

@bartlettroscoe I'm trying to convert a version of Albany to MiniTensor. There is a file MiniTensor_config.h that gets generated in the build process and placed in the Trilinos build directory under packages/minitensor/src. But that header does not get copied to the Trilinos install directory, so the Albany build fails. I compared with Intrepid2 and found a corresponding Intrepid2_config.h, but theirs gets copied to the install directory. I have compared CMakeList.txt files and I can't figure out what command or instruction to use to copy the MiniTensor_config.h header to the install directory.

bartlettroscoe commented 7 years ago

There is a file MiniTensor_config.h that gets generated in the build process and placed in the Trilinos build directory under packages/minitensor/src. But that header does not get copied to the Trilinos install directory, so the Albany build fails.

@lxmota,

My bad. You need to add that header file ${CMAKE_CURRENT_BINARY_DIR}/${PACKAGE_NAME}_config.h to your HEADERS variable. For an example, see:

https://github.com/trilinos/Trilinos/blob/master/packages/epetra/src/CMakeLists.txt#L39

That will result in that file getting installed.

Let me know if you have problems with this.

lxmota commented 7 years ago

@bartlettroscoe Thanks, trying it as we speak.

lxmota commented 7 years ago

Tasks 7 & 8 have been completed: 7) Port Sierra/SM to MiniTensor 8) Remove old Intrepid_MiniTensor headers from Intrepid