Closed lindsayad closed 5 years ago
This is going to take a while to dig through, but the first thing that screams at me is naming choice. We'll now have a SparseNumberBase that's the base class for DynamicSparseNumberFoo, while at the same time completely unrelated to SparseNumberFoo? Way too misleading. Maybe "Semidynamic" or "Preallocated" or something for the "not dynamic allocation but also not entirely compile-time" option?
The name change is also wrecking these diffs. Whatever we end up at for the name, can we please keep the "actual functionality changes" and the "search and replace name change" broken into two separate commits? I'd like to check the former by reading every line, whereas I'd like to ignore the latter by assuming you can use a regexp correctly. :-D
Maybe "Semidynamic" or "Preallocated" or something for the "not dynamic allocation but also not entirely compile-time" option?
That's a good name for my planned std::array
derived class, but I'm at at loss for the "most base" class. I agree that SparseNumberBase
is unfortunate and I should change it...but I need a good replacement for it.
Whatever we end up at for the name, can we please keep the "actual functionality changes" and the "search and replace name change" broken into two separate commits?
There shouldn't actually be any functionality changes in the code movement from DynamicSparseNumberBase
to WhateverWeEndUpNamingTheMostBaseClass
. I don't plan to change any lines of code other than things like replacing std::vector<T>::iterator
with auto
. I can still probably split out those changes from the renaming though.
@roystgnr I renamed SparseNumberBase
to DynamicSparseNumberAbstract
. I'm happy with where this is
Hmm... I guess "Abstract" works better than "Base" as long as we've got pure virtuals there.
But wow, should we really have pure virtuals there??? I realize in hindsight that this is all my fault in the first place, for not getting rid of the virtual in safe_bool or at least declaring the derived constructors final, but now that I look at it, it feels completely insane to have a virtual function call in a destructor for a small object intended to be used in temporaries in innermost kernel loops!
Could you try your MOOSE AD benchmark with all the destructors marked non-virtual (should be safe, we'll never be deleting from a pointer to base class here, right?) and see if it makes any noticeable difference?
Alright, the newest commit rips out all virtuals (why I had them in the first place was probably because I had some other design in mind), which of course now begs the question of why even have the new class. The only argument I see is to maintain the signature of DynamicSparseNumberBase
. Is that important to you @roystgnr ?
With the virtuals removed, the Jacobian calculation results in #1 are the fastest yet, although it could just be within error.
which of course now begs the question of why even have the new class
... because they have different behaviors? If DynamicSparseNumberArray is way slower but allows run-time selection of maximum index size, and SemiDynamicSparseNumberArray is way faster but forces you to set and preallocate an upper limit, it seems like you can't get rid of either without losing a valuable option. Am I misunderstanding the question?
With the virtuals removed, the Jacobian calculation results in #1 are the fastest yet, although it could just be within error.
Could you post the updated numbers?
Am I misunderstanding the question?
Sorry that was indeed confusing. We definitely want the derived class SemiDynamicSparseNumberArray
but this PR also introduced the most-base class DynamicSparseNumberAbstract
(which is no longer abstract). So we currently have:
DynamicSparseNumberAbstract
<- DynamicSparseNumberBase
<- DynamicSparseNumberArray(Vector)
DynamicSparseNumberAbstract
<- SemiDynamicSparseNumberArray
But I think we could just get away with:
DynamicSparseNumberBase
<- (Semi)DynamicSparseNumberArray(Vector)
If we change the template parameters for DynamicSparseNumberBase
to be those of what is currently DynamicSparseNumberAbstract
Ah, got it. Yeah, the latter seems more reasonable if you can pull it off. Sorry about the hassle (and about setting a bad example with the previous virtuals).
Sorry about the hassle (and about setting a bad example with the previous virtuals).
I never think of it as a hassle to get things right!
Paring of results from #1. Timings are of SNESComputeJacobian
(parantheses are _platform_memmove
) with 200x200 mesh using MOOSE's ad_lid_driven_stabilized.i
input file:
N
= 50: 13.3 s (2.79 s)N
= 100: 12.47 seconds (1.82 s); this timing seems pretty more or less independent of N
Looks like you're correct on both parts of "fastest yet, although it could just be within error". But if it's not just random noise, going from 13.38 down to 12.47 isn't shabby; let's leave the virtuals off.
And of course going from 150 down to 12.47 is huge, as is going from 13.3 to 12.47 while adding flexibility. So I'm definitely liking the outcome here. Do any cleanup you want to do on the branch (I see that fixup commit, and of course getting rid of the 'temporary' renaming would be nice if it's not too impractical) and I'll try to finish reviewing tomorrow.
It was easy to drop the Abstract
class.
Sigh...I was very fastidious about turning off my formatting save hook, but it looks like not fastidious enough. Especially with being able to just change the template parameters for DynamicSparseNumberBase
, this should actually be a relatively small diff. I'm gonna go back and clean up those diffs.
Alright, fixed up the diff; looks pretty clean
I was going to complain that this breaks C++98 support, but double-checking master, we broke C++98 support a while ago. You get what you CI test, I guess, huh?
I was going to complain that this breaks C++98 support, but double-checking master, we broke C++98 support a while ago.
It wouldn't be hard to make this PR C++98 compatible. I know the NotADuckDualNumber
stuff heavily leverages C++11, but it could be guarded.
It wouldn't be hard to make this PR C++98 compatible.
Really? At first glance it looked like there were a lot of quick fixes (auto -> WhateverSubclassTemplate::iterator, etc), but you're also relying on variadic templates pretty heavily, right? I didn't see any way around that.
If you can actually keep all this stuff C++98 compatible that would be great, but that's not a show-stopper; other than the duplicate file you just removed this looked okay to merge as-is.
Really? At first glance it looked like there were a lot of quick fixes (auto -> WhateverSubclassTemplate::iterator, etc), but you're also relying on variadic templates pretty heavily, right? I didn't see any way around that.
You're right. The variadics are definitely a central feature of this PR ☹️ I was only thinking of the auto
stuff.
We're probably going to change our derivative type in MOOSE to the SemiDynamicSparseNumberArray
. I think this is a big enough addition to warrant a minor version bump, but I also question the frequency with which I've been doing releases (almost a release per PR). So I'd be happy to wait for some more content before doing a version change, in which case I'd make a boostrapped PR branch.
When do you think MetaPhysicL will be ready for version 1??
I think this is a big enough addition to warrant a minor version bump, but I also question the frequency with which I've been doing releases (almost a release per PR). So I'd be happy to wait for some more content before doing a version change, in which case I'd make a boostrapped PR branch.
Honestly? This stuff is great, certainly worth calling it 0.6.0.
When do you think MetaPhysicL will be ready for version 1??
I think there's not too much stuff I'd want on my must-have list: fixing up the guards around C++11 and C++14 dependent bits, adding move optimizations to everything ... Optimizations for nested DualNumber (and ideally DualExpression) might be nice, but aren't critical. I think the biggest thing on my wishlist is automated performance testing, though. We really need something like a "make benchmark" that we can just run right off the bat to test proposed changes for improvements or regressions, without anyone having to rebuild (or for other users, download and build!) libMesh and MOOSE first.
An update on this test case:
ad_lid_driven_stabilized.i
Linking to idaholab/moose#13963 which also contains some discussion on relative speed of NumberArray
vs SemiDynamicSparseNumberArray
DynamicSparseNumberBase
intoDynamicSparseNumberAbstract
SemiDynamicSparseNumberArray
which has the same sparse operations as its pure dynamic counterparts but usesstd::array
as the backing storage