Closed kutschke closed 2 years ago
There are at least 14 of these issues in Mu2e Offline. Here is the worst case:
The issue is found inside a helper function that is used in the intiailizer list.
Clearly I need to go back to rvalue reference school.
I don't think that I have fully wrapped my head around rvalues and rvalue references as used in this case. I would appreciate it if you can help me sort it out.
First, in TFileDirectory, only one c'tor is specified. It's a 4 argument c'tor and it's protected. I don't see any declaration of default c'tor, the copy c'tor or copy assignment, nor the move c'tor and move assignment. Does making the one declared c'tor protected, also make all rule of 5 functions protected?
What about the default c'tor? I have a vague recollection that it's implicitly deleted if you provide any non-default c'tor. Do I remember correctly?
The function TFileDirectory::mkdir returns a TFileDirectory object by value. The caller cannot receive the return by value because there is no public copy c'tor or public copy assignment. So the returned object must be an unnamed temporary. If the caller were to receive the return value as an lvalue reference, there would be a memory leak because an lvalue reference never deletes its referent and there is no other mechanism to delete the temporary. Therefore the language forbids it. The caller may receive the return value as an rvalue reference because an rvalue reference will delete it's referent when it (the rvalue reference) goes out of scope. So that works. Is that story right so far?
Now for more confusion. As best I can there is no move c'tor or move assignment - so why does return by rvalue reference work? Does the compiler treat the publicness of these functions differently than it does the copy c'tor and copy assignment? Or does it bypass those functions entirely because it can construct the temporary in place by RVO? Or ... ?
And one side issue. I notice that TFileDirectory is a base class that is inherited by TFileService. However TFileDirectory does not have a declared virtual d'tor. Last I heard ( a long, long time ago) a base class was required to provide a virtual d'tor or bad things could happen. Have the rules changed?
I have done some more homework and received some input from Ray. So I now understand the questions raised in this comment. I am leaving it here for reference.
------------------ original comment below ---------------------------
I read some more about rvalue references and learned that you may bind an rvalue to a const lvalue reference. I have made two small examples to play with this:
https://github.com/kutschke/TFD
See the README. So my question is: is my example TFD2_module.cc legit c++ ?
With some input from Ray I have been able to focus on the core problem. This expression:
std::string name = ... ; artFileDirectory& topdir = ...; art::TFileDirectory tfdir = name.empty() ? topdir : topdir.mkdir(name.c_str());
no longer compiles. I understand that a copy is needed when name is empty - and there is no copy c'tor. So it fails. Hopefully the intent of the code is clear. After doing my homework I expected the following to compile:
art::TFileDirectory const& tfdir = name.empty() ? topdir : topdir.mkdir(name.c_str());
I based this expectation on the fact that the following two lines do compile: art::TFileDirectory const& tfdir = topdir; art::TFileDirectory const& tfdir = topdir.mkdir(name.c_str());
Note that these are the two components in the ternary expression. As it happens, the ternary fails with the error that there is no public copy c'tor or copy assignment, the same as before.
I think I understand and would like to know if I am right. I think that each of the 3 expressions in the ternary operator are their own scope - is that true? If so the temporary returned by topdir.mkdir(name.c_str()) will go out of scope and be deleted immediately. Therefore there is nothing to bind to lhs. Therefore compiler decides to make a second unnamed temporary that is a copy of the first unnamed temporary but having a lifetime matched to the scope of the lhs. If the compiler were smarter it could elide the copy and adjust lifetimes. But it's not that smart and there is no way to do the copy. Is that analysis correct?
So the Catch 22 is that topdir and the temporary returned by topdir.mkdir(name.c_str()) have different lifetimes. So there is no way to bind them to a single lhs object.
As best I can tell, we are left with 3 options.
Code fragment for option 2: If ( name.empty() ) { // do the work (topdir); } else { art::TFileDirectory tfdir = topdir.mkdir(name.c_str()); // do the work (tfdir) } If the “do the work” is more than 1 line we can encapsulate it in a function. Option 3 is straightforward but tedious and error prone.
@kutschke, thank you for the bug report. I understand the problem. I will investigate solutions today.
@kutschke, there is another option you might consider:
- art::TFileDirectory tfdir = name.empty() ? topdir : topdir.mkdir(name.c_str());
+ art::TFileDirectory tfdir = topdir.mkdir(name);
If name
is empty, tfdir
will refer to the same directory as topdir
--that's because the argument to mkdir
is interpreted relative to topdir
. Can you try this and see if it works? You'll still need to pass in the topdir
argument by reference instead of by value, but perhaps that's sufficient.
A couple points:
TFileDirectory
contains a mutex data member that is not copyable and the user-defined copy constructor was just ignoring it.@kutschke, there is another option you might consider:
- art::TFileDirectory tfdir = name.empty() ? topdir : topdir.mkdir(name.c_str()); + art::TFileDirectory tfdir = topdir.mkdir(name);
If
name
is empty,tfdir
will refer to the same directory astopdir
--that's because the argument tomkdir
is interpreted relative totopdir
. Can you try this and see if it works? You'll still need to pass in thetopdir
argument by reference instead of by value, but perhaps that's sufficient.
Thanks Kyle. I will try it and let you know the result.
The code compiles but I have not yet tested that it works. I think I know how to make a standalone test that does not require making all 14 edits.
I am confused about one thing. Suppose that the string is empty. It will create a new TFileDirectory object pointing to the same underlying TDirectory. Presumably I can access that one TDirectory using either object. Does the mutex protect against that? For our code, as it exists today, there can never be a race condition since the relevant parts of our code are single threaded. I am asking in case this needs to be on our radar in the long term future.
The code also passes my standalone tests. I will pass this on to Mackenzie so that she can change it where needed, complete the port to art v3.10 and test.
Please close the ticket - we will open a new one if there are follow-on issues.
Glad to hear, Rob. Thanks for the update. Incidentally, I will open a new issue report:
Unfortunately, each TFileDirectory
object has its own mutex--this is a hole in the current system. Anyone who interacts only with TFileService
uses the same mutex (as TFileService
is a derived class of TFileDirectory
). However, once a separate TFileDirectory
object is created, a separate mutex is created also. This must be fixed.
Comment added Dec 12, 11:47 AM: I have been working on this myself with some input from Ray. You might want to jump straight to the 5th messsage.
Describe the bug In art suite 3.10 the public c'tors of TFileDirectory have been removed. This has knock on effects in Mu2e code. I have a fix that works but it's very intrusive. I would like to understand if a less intrusive solution is possible.
To Reproduce In a clean directory that sees the mu2e cvmfs area:
This fails with the error:
To build the version that works
To see the differences between the original code and the fix:
The issue is in the .cc file. I would like to keep the structure of the original code in which we define an object that is/refers to/points to the correct directory; then have one block of code to make the histograms. I have not been able find a type that can refer to either topdir or to the return value of mkdir. It's not a big deal for a simple case like this but, if we need to create many histograms, it would be awkward. I suspect there will be many cases like this in our code base; there are at least 2.
If this is a high-priority issue Yes. This is high priority but I am not able to choose labels.