Closed nschloe closed 8 years ago
Hey @nschloe , thanks for reporting this! btw does this code actually use Teuchos_RCPStdSharedPtrConversions.hpp?
Yup, but not in this small example. :) I removed the superfluous #include
s.
@nschloe Is this a debug or release build? Is Teuchos_ENABLE_DEBUG ON?
For the sake of completeness, here's my CMakeLists.txt
:
CMAKE_MINIMUM_REQUIRED(VERSION 2.8.8)
PROJECT(Nosh CXX)
FIND_PACKAGE(Trilinos REQUIRED COMPONENTS Stratimikos Thyra Tpetra Ifpack2)
INCLUDE_DIRECTORIES(
SYSTEM
${Trilinos_INCLUDE_DIRS}
${Trilinos_TPL_INCLUDE_DIRS}
)
SET(MY_EXECUTABLE ifpack2test)
ADD_EXECUTABLE(${MY_EXECUTABLE} main.cpp)
TARGET_LINK_LIBRARIES(
${MY_EXECUTABLE}
${Trilinos_LIBRARIES}
)
set_property(TARGET ${MY_EXECUTABLE} PROPERTY CXX_STANDARD 11)
It's configured with
cmake \
-DCMAKE_BUILD_TYPE:STRING=Debug \
-DCMAKE_CXX_COMPILER:STRING=mpicxx \
../source/
It may help to set Teuchos_ENABLE_DEBUG:BOOL=ON
, for e.g., additional RCP checks.
This is for the Trilinos build, right?
This is for the Trilinos build, right?
Yes :)
Tough, I'm getting internal compiler errors when enabling that option; see, e.g., here.
Can you reproduce the segfault with the above MWE?
Hi @nschloe -- it builds for me. My recent commit added a test, stratimikos/test/test_issue_535.cpp. The test fails in both MPI_DEBUG and SERIAL_RELEASE builds. It looks innocuous to me, but I'm not a Stratimikos or Thyra developer, so I'm not sure if you have to call some registration function first before you're allowed to create Ifpack2 things. @bartlettroscoe , could you comment? Thanks!
I'm looking at ifpack2/adapters/thyra/Thyra_Ifpack2PreconditionerFactorydef.hpp, line 163. It turns out that constParamList (and therefore paramList) is null at that point. Can't get pineapple juice from a turnip; can't get a string parameter from a null ParameterList. Let me compare against the Ifpack(1) factory to see what's going on. My guess is that this class never gets used in this way -- @bartlettroscoe , is @nschloe 's use case typical?
@bartlettroscoe , when would the Ifpack2PreconditionerFactory's setParameterList method normally get called? As far as I can tell, that's the only place in which paramList_ could possibly get set. My printf debugging indicates that this method never gets called.
I tried something: After line 163 of ifpack2/adapters/thyra/Thyra_Ifpack2PreconditionerFactorydef.hpp, if paramList is null, I set constParamList to the result of getValidParameters() instead. That makes the test pass.
I'm not committing this change until I hear back from @bartlettroscoe to make sure that this won't break something else.
@mhoemmen and @nschloe,
I had never looked at this code before. It looks like the initial version was contributed by:
01d1b7a "Ifpack2: Added Thyra adapter code"
Author: Julien Cortial <jcortia@sandia.gov>
Date: Wed May 29 14:58:10 2013 -0700 (3 years, 2 months ago)
A packages/ifpack2/adapters/thyra/Thyra_Ifpack2PreconditionerFactory.cpp
A packages/ifpack2/adapters/thyra/Thyra_Ifpack2PreconditionerFactory_decl.hpp
A packages/ifpack2/adapters/thyra/Thyra_Ifpack2PreconditionerFactory_def.hpp
I will look at this a little and then comment back.
But note that in general you never want to commit a test that you know is broken. This breaks the CI build (stopping people from pushing) and just creates a lot of red on the dashboard (which teaches people to just ignore failures). We will discuss this at the next Trilinos Leaders Meeting.
I tried something: After line 163 of ifpack2/adapters/thyra/Thyra_Ifpack2PreconditionerFactorydef.hpp, if paramList is null, I set constParamList to the result of getValidParameters() instead. That makes the test pass.
The Teuchos::ParameterListAcceptor interface says that the defaults for the parameters being set should be put into the non-const PL. Why was this not followed for this factory like for the others? Was there a reason for this or just an oversight?
Also, if the PL is not set, then it should not be read. The factory object (or any object) that is not given a PL should still do something logical even if no PL gets set. Does that make sense?
Otherwise, is there a reason why the Ifpack2 factory was not registered the Stratimikos::DefaultLinearSolverBuilder like Ifpack? That would make it show up automatically.
I have my other questions as I start to look over the Ifpack2 code but I don't think this issue ticket is the right place for those.
The Teuchos::ParameterListAcceptor interface says that the defaults for the parameters being set should be put into the non-const PL. Why was this not followed for this factory like for the others? Was there a reason for this or just an oversight?
git blame
says "Julien Cortial" wrote this class. I made a tiny change in 2013 and another tiny change in 2015. Those changes are not relevant to this issue.
The class looks a lot like the Ifpack(1) factory, so my guess is that it was just a copy and paste. Does the Ifpack(1) factory do the right thing? (It lives in stratimikos/adapters.)
"Julien Cortial" wrote this class
I don't think I ever met Julien. Is he still around SNL?
Does the Ifpack(1) factory do the right thing? (It lives in stratimikos/adapters.)
Yes (but there are no automated tests to prove that). It does:
if(paramList_.get()) {
Teuchos::ParameterList
&ifpackSettingsPL = paramList_->sublist(IfpackSettings_name);
// Above will create new sublist if it does not exist!
TEUCHOS_TEST_FOR_EXCEPT(0!=ifpack_precOp->SetParameters(ifpackSettingsPL));
// Above, I have not idea how any error messages for a mistake will be
// reported back to the user!
From there, it is up to the Ifpack code itself to deal with PLs correctly.
I guess for now, since the Ifpack2PreconditionerFactory reads a const list, the change you suggest [above]() would be fine for now. I will create new issue tickets for a) addressing issues with the Teuchos::ParameterListAcceptor documentation and b) integrating Ifpack2PreconditionerFactory into Stratimikos proper.
@bartlettroscoe if (paramList_.get ())
just means it checks whether the current ParameterList is null. Should it be up to the specific Factory whether creating a solver without parameters makes sense?
Anyway, I'll make the suggested change. Thanks for looking at this!
Should it be up to the specific Factory whether creating a solver without parameters makes sense?
That has to be specified in the abstract interface and then all of the implementations need to follow that. The assumed behavior for the Teuchos::ParameterListAcceptor interface should be that a PL is not required in order for the object to do something valid. We obviously need to write down those specs and then upgrade the underlying subclasses to follow. But this might break backward compatibility with non-complaint implementations so now we have a mess.
I just pushed the proposed fix. I'll close this issue for now but everybody, please feel free to continue discussion. @bartlettroscoe will open other issues as needed.
The assumed behavior for the Teuchos::ParameterListAcceptor interface should be that a PL is not required in order for the object to do something valid. We obviously need to write down those specs and then upgrade the underlying subclasses to follow. But this might break backward compatibility with non-complaint implementations so now we have a mess.
I agree that this is a dilemma. On the one hand, parameters are "options" and therefore "optional." On the other hand, what's a "default preconditioner"? Users might forget to tell us which preconditioner they want, and it makes sense for us to catch that early. Otherwise, they get unexpected results and we have to waste time digging through their input deck.
Thanks everyone for investigating!
@nschloe,
Thanks everyone for investigating!
I think the bottom line from this discussion is that almost every realistic use case will require that a PL be registered with a factory object. We need to pin down the expected behavior better.
Also, I think that if you turn on debug-mode checking (e.g. -DTrilinos_ENABLE_DEBUG=ON
) then this segfault will be replaced with a nice exception, including a stack trace if you have BinUtils enabled.
That was Mark's tip too. I'll try to enable this in the Debian build. (A binutils bug will have to be resolved first.)
A binutils bug will have to be resolved first
BinUtils is only needed if you want a stacktrace without running in a debugger. But if you build a full debug version of the code (e.g. -DCMAKE_BUILD_TYPE=DEBUG
) then you can run your code in a debugger and set a breakpoint when the throw occurs. See documentation for TEUCHOS_TEST_FOR_EXCEPTION()
). In this case, you don't really need BinUtils. But for non-deterministic errors, the stacktrace produced but BinUtils is super useful.
When using Thyra with Ifpack2,
initializePrec
segfaults.MWE:
Output:
Backtrace: