Nick's work. - Githubissues

Reported by nick on 20 Sep 2011 16:11 UTC I'm not sure at what point I last pushed my changes, but now that I've implemented depth cuing and surface normal highlighting, it seems a good time to request a code review.

The code is in ~Nick/hemelb. Please use revision 5b24c4caba91

Cheers,

Nick

Comment by hywel on 20 Sep 2011 17:59 UTC Please attach the diff here yourself (then we don't have to rely on everyone being on the same network). Something like "hg diff -r parent > ~/Desktop/diff.diff".

Since your work has a lot of changes and we haven't reviewed the design or the code so far, can you use a diff from acb13ad75950 (which I believe is the uber-parent revision of all your work).

It would also be helpful if you described the end-goal of your work, how you've designed it and what testing you've done.

Thanks!

Comment by nick on 21 Sep 2011 10:45 UTC I'm afraid that there's been multiple merges occurring since then, so the diff file contains a lot of unnecessary content).

You'd be better off doing a code review on the whole of the following files:

vis/rayTracer/*
vis/Traverser
vis/Vector3D.h
vis/Viewpoint.*

End goal - make ray traced images appear more 3D Design - a ray now contains a derived RayData object, containing implementation details that differ depending on how the data contained in the voxels should be processed. The details for each site are input into this object. Merging occurs on the RayData and the colours are then obtained from each ray at the end. Testing - running diffTest and memTest - I'm afraid the code isn't separated enough in order to run units tests on it.

Best wishes,

Nick

Comment by hywel on 21 Sep 2011 15:43 UTC Presumably your new code changes the images significantly? If so, make sure the final committed version either uses the original !RayTracer or has updated diffTest images (which we'll need to see first). No worries about the unit-tests for now.

Overall comments: most of the changes are sensible but wrong include guards, comment typos, wrong namespaces are happening a lot. Also, lots of files include the assert.h header and #define NDEBUG. Please remove all of these before the next code review - some of them were there in the previous one and these reviews take a long time as it is.

Cheers

Vector3D

Include guards are the wrong format (should be HEMELB_VIS_VECTOR3D_H)
The indentation is wrong too - using emacs is fine, but it needs to conform to the same formatting as you'd get using eclipse's autoformatter. A big part of that is using 2 spaces instead of a tab.
converstion is a typo
is there a reason += doesn't return a Vector object?
produt is a typo
I think you should cast the multiplier to the type of T before you multiply.
!UpdateMinVector3D and !UpdateMaxVector3D aren't nice names. !PointwiseMin and !PointwiseMax might be nicer. I'd also suggest either making the method non-static and have it update the vars in place OR have it return a new Vector3D with the new values.
couldn't operator*(const T left...) be more general if you template on the left-hand parameter too?
I think this class probably belongs in the utils namespace.

Viewpoint.h

Formatting again. I'm guessing this is the case in all the files. I'd also say that comments are easier to read if there's a space after the '//'.
Typo: coordintes, Word, iThetaY -> iSinThetaY
Is there a reason mDistanceFromEyeToScreen can't be private? If so, you should put in a comment.

Viewpoint.cc

Typo: reflectiond, oringinal
The naming switches between camera and eye - we should use one or the other, I think.
In general, much cleaner code than I remember it being, which is good.

XYCoordinates.h

Include guard is wrong.
Couldn't this whole class be combined with Vector3D to make a more general template template Vector?

!VolumeTraverser.h & cc

There's lots to like about this code.
All these traverser files look like they're really more about the lattice rather than about visualisation. That puts them in the hemelb::geometry namespace (which would encourage reuse in all kinds of other places)
The include guard is wrong
I'm not keen on the double-line-breaks.
Can you make the constructor protected?
I haven't looked at the lattice data code but it'd be nice if it had some !GetTraverser methods.
Actually, given that it assumes a specific vector->scalar transformation to get the index of a site / block, it should be somewhere near the !LatticeData which assumes the same mapping.

!SiteTraverser.h & cc

!SiteTraverse is a typo, ditto speficied
Why does the constructor take a !LatticeData reference, rather than a pointer? The pointer version would make more sense here.
Include guards...
The cc includes the wrong header file.

!BlockTraverser.h & cc

Why not use a pointer to the !LatticeData? I think it makes more sense.
!GetSiteCoordinatesOfLowestSiteInCurrentBlock might be nicer as !GetFirstSite
The #define and the include of assert.h in the cc are unnecessary.

!BlockTraverserWithVisitedBlockTracker

Wrong namespace
The class comment has some typos, and should use the /\ comment style (for auto-documentation purposes)
Later comment in the header file has typos too
I haven't checked, but are your destructors virtual in the base classes?
The cc has a #define and an include that shouldn't be there
The cc include !RayTracer.h which it probably doesn't need
Take out the asserts.
I like the idea of the class, just fix this stuff and put it in a more reusable place.

!SiteData

Include guards don't match namespace
Struct name doesn't match file name
Why don't we encapsulate more? I.e. make the member variables private, add public get functions and a 3-parameter constructor to set the member vars.

TestHSLToRGB

This would be better as a more automatable test - if you have a set of values to try to convert and compare them against the known answer, we can later make that a unit test when you merge with other code.

HSLToRGBConverter

Namespace vs location vs include graphs
You can probably just call the function Convert, as the class name tells you the 'from' and 'to'.
NDebug, Assert
The method seems quite over-engineered, and there are few comments. It's not clear to me why we can't use floats, nor why we have to specify how many bytes each int is. The wiki page you link to makes the conversion look much simpler.

!RayData

Namespace vs include guard
Assert
Where is NO_VALUE_F defined? Does it have a sensible value?
Typo: Cals, ///
!UpdateDataCommon might be better as !UpdateRayDataCommon

!RayDataNormal

"Passed as references" comment - can't see what this applies to
Typo: represting
NDEBUG, assert
Can you put a BIG Todo comment somewhere saying that having !ColPixel templated on the ray type is a temporary measure until we split up the pixel across the different drawing classes.
!RegisterMpiDataType creates a datatype every time it's called, which is never freed = memory leak. I realise this probably wasn't your design, but I'm afraid you've got to fix it! We'll have to do some kind of static initialisation and freeing of a static type, and return it on every call to the function.

!RayDataEnhanced

Class comment has wrong format and typo
I agree with your TODO - is it possible to do this now or is another bit of code in the way?
Do we definitely want to do the "fabs(lDotProduct)"? Have you tried versions where the back wall of a vessel isn't displayed?
There's a few magic numbers here which should be constants somewhere.
Is the cc neccessary?
Are all 3 MPI datatype implementations necessary? The same memory leak problem will apply as before and, I'm afraid, you've got to fix it again!

Ray

Namespace vs include guard
assert include
It's not clear what mInWall does and why, sometimes, calling !UpdateDataForWallSite sometimes calls !UpdateDataForNormalFluidSite. Please add a comment.

!RayEnhanced.cc

This file looks like it shouldn't be there.

!ColPixel

Just had a quick glance - it has a lot of commented-out code that should be removed.

!RayTracer

Include vs namespace
NDEBUG and assert
No need to include the debugger
The header file includes itself
Much cleaner separation of the functionality now, which is nice.

!ClusterRayTracer

Include / namespace
NDEBUG and assert
This should take pointers, not addresses.
Lots of long parameter names, these will look nicer if you can think of shorter versions (not always easy / possible).
Rather than "//(Remember that iRay.mDirection is normalised)", can we just make the function be called "iRay.!GetNormalisedDirection()".
Typo: verticies
The repetition of !UpdateSubImageExtentForCorner could be avoided by using an array of mins and maxes of the cluster and looping.
!SubImageOffScreen could become a single return statement.
We don't need the comment: "//This used to be AABB" - the main point is what the code is now!
!UpdateDataForNormalFluidSite should take a pointer to the site data
Ditto the arguments to !CalculateRayUnitsBeforeNextVoxel
Some of these functions could be 'const' - that helps the compiler optimise.
!TraverseBlocks has good commenting which stops halfway through.
If you used Vectors, and defined the values in your Direction enum, you could get rid of your switch statements (at least one, anyway).
There are a lot of static casts that could be simpler (int)-style casts that would give clearer code.
The comment on the last line is wrong.

!ClusterShared

!DoGetBlockIdFrom3DBlockLocation repeats an assumption that's in other places (like the !LatticeData), which should be combined somewhere.

!ClusterTraverser

Include guard != namespace
This should keep a pointer
Would it be useful to have a method to return info about the current location in the cluster? Would that simplify some of the classes that use it?

!ClusterNormal

NDEBUG and cassert
It seems weird that this class even has the !DoGetWallData and its 'set' equivalent. This probably means that the encapsulation needs to be higher up the stack somewhere.

!ClusterBuilderWithWallNormals

Is this file meant to be there? I can't see a corresponding cc.

!ClusterBuilder

Includes the debugger, assert and NDEBUG
The formatting is strikingly different from eclipse's (and, therefore, the rest of the code's)
There's an extra "private:" specifier
Well commented class. !FindNewCluster comment isn't complete.
Typo: interget
All the logging here is going to create a lot of info, on a low debug level. Does it need to be outputted at all? If so, only on log::Debug please. Rather than using asserts, you could output debug information if a constraint isn't met, and only on Debug log-level. That way, we'll get minimal, only helpful output.
Can this class use one of your traversers instead of the normal 3-level loop?
"insertion of copying" should probably be "insertion or copying", right?
I can't see where !GetSiteCoordinatesOfBlock is defined.

Comment by hywel on 22 Sep 2011 17:48 UTC Thinking about it, don't worry about fixing the memory leaks - I'm working on splitting up the !ColPixel at the moment which will make that way easier. Just in case I take ages, it might be worth you putting a big 'todo' in your code, acknowledging that there's a leak.

Comment by nick on 23 Sep 2011 11:06 UTC I'm still working on the code review but didn't quite understand the leaks.

Separation is all well and good, but do you realise that it'll result in more data being sent over the network? Better I would have thought would be to streamline the ColPixel class into the minimum components and have other classes work on it.

Comment by hywel on 23 Sep 2011 11:39 UTC No worries. The leaks occur because every time we request the MPI datatype for transferring a !ColPixel we create one and call MPI_Type_commit. This tells MPI to remember the 'shape' of the type. We never call MPI_Type_free on the same types so that dynamically allocated memory is never freed. It's a small amount of memory but it's done every time we send pixels, which is often (when steering).

It might result in a bit more data being sent. Overall I think it'll result in less. Each sent pixel will only be as large as it needs to be (no streakline variables sent if we're not drawing them). We duplicate data on each core and the first version will communicate things like pixel location multiple times. It'll be easy to change that in future versions if it ever becomes limiting. This means a bit more computation per core, but it's small wrt the LBM computation and that's often the price paid for flexible functionality.

Comment by nick on 23 Sep 2011 12:14 UTC That makes sense, but if I understand Rupert's Code, the Register method should only be called once for every data time in the life time of the application:

if (mpiType == MPI_DATATYPE_NULL) { mpiType = RegisterMpiDataType(); }

Would it be possible to have a singleton for MPI that keeps track of types and frees them on termination of the application?

Comment by hywel on 23 Sep 2011 13:08 UTC My bad, didn't spot that. So there's only a small memory leak instead.

The singleton would be the only way to keep the current pattern (though it'd only have to be for MPI datatypes, not for all the MPI stuff). But this might be unnecessary which is why I suggested holding off. We could end up with a system that uses a different data type on each communication cycle, or one which creates a single type but does it dynamically, which would make the singleton unnecessary.

Comment by nick on 23 Sep 2011 13:43 UTC Replying to hywel:

The singleton would be the only way to keep the current pattern (though it'd only have to be for MPI datatypes, not for all the MPI stuff). But this might be unnecessary which is why I suggested holding off. We could end up with a system that uses a different data type on each communication cycle, or one which creates a single type but does it dynamically, which would make the singleton unnecessary.

An interesting idea - so put a header which indicates the types and number of objects so the receiving end can create the same type?

Comment by hywel on 23 Sep 2011 13:55 UTC Not a header; you have to know the length of the whole message before you can receive any part of it. All we can do is send data one iteration in advance that tells another proc what'll come on the next iteration (unless we can get a way where the proc's own data is sufficient for it to work that out).

Comment by nick on 23 Sep 2011 17:11 UTC General

While carrying out the changes, I accidently introduced a bug which an assert statement caught straight away. Since we have no unit test I have therefore chosen to keep them in until it can be gauranteed that I won't be changing the code or any code which uses code which currently has an assert statment - sorry but I don't have time to spend a whole day fixing a bug I could have spotted in seconds.
The diff is prior to reformatting using eclipse, for clarity.

Vector3D

is there a reason += doesn't return a Vector object?
I had assumed it didn't need one, but having looked it up, I can see why it should have one.
I think you should cast the multiplier to the type of T before you multiply.
Correct me if I'm wrong, but I think that if the Vector3D was teplated to int, and the multiplier was a float, the current code would do a floating point multiplication, then cost the result to int, which may be preferred.

Viewpoint.cc

Removed comment about previous code based on later comment

XYCoordinates.h

Couldn't this whole class be combined with Vector3D to make a more general template template Vector?
Probably, if you're willing to use a union shudders or auto generated code other than with templates
Class kept in the vis namespace as I don't think anywhere else in hemelb can make use of a 2D class

HSLToRGBConverter
The method seems quite over-engineered, and there are few comments. It's not clear to me why we can't use floats, nor why we have to specify how many bytes each int is. The wiki page you link to makes the conversion look much simpler.
Floats can be used, but unsigned integers have been preferred for more even precision. Sizes have been specified since C++ gives no guarantees about the size of types.
All the maths carried out in the code matches that on the wiki page

!RayData
Where is NO_VALUE_F defined? Does it have a sensible value?
- constants.h - const float NO_VALUE_F = std::numeric_limits::max();
- I've got rid of it now - the code that relies on it appears to have since been refactored by me.
!RayDataNormal
Can you put a BIG Todo comment somewhere saying that having !ColPixel templated on the ray type is a temporary measure until we split up the pixel across the different drawing classes.
Splitting the pixels types is a very good idea, but whichever class takes over the consolidations of all three pixels types into a single colour from ColPixel will have to be templated on the ray type. So really this isn't a temporary measure - just that ColPixel can be refactored into 4 classes, two of which will remain templated including the one which can legimitimately be called ColPixel. The only alternative is run-time polymorphism.
- !RegisterMpiDataType creates a datatype every time it's called, which is never freed = memory leak. I realise this probably wasn't your design, but I'm afraid you've got to fix it! We'll have to do some kind of static iniialisation and freeing of a static type, and return it on every call to the function.
- I don't understand the leak? The function should only be called once. Everything I create in the MPI_Datatype specialisations is on the stack and the returned type gets saved to a static member of the templated MpiDataTypeTraits class, which will get cleared up on termination. If it's a side effect of one of the functions I call, some sort of singleton will have to be implemented to provide clean-up.
!RayDataEnhanced
Do we definitely want to do the "fabs(lDotProduct)"? Have you tried versions where the back wall of a vessel isn't displayed?
Looking down the tube is no longer possible with the diffTest, and the end of the tube doesn't show up. Obviously the rest of the tube looks fine, but assymetric blood vessels won't look as 3D. Is an extra if statement really going to be cheaper than fabs and a multiplication? I realise fabs isn't quite as efficient as changing the sign bit as it does some checks for special values, but I presume the branch prediction means that the hit is tiny.
I agree with your TODO - is it possible to do this now or is another bit of code in the way?
Done, the division is moved to the Get_Colour methods, this has the impact that GetAverage__ methods will change. Thus far, this isn't a problem, as those methods aren't used. But if you were later to use them, perhaps for things like being able to click on a pixel and getting a velocity, the division won't be done. To be honest, though, if we are going to report absolute velocities, much more will need to be looked at than this. I've had to do a horrible hack for RayDataNormal, as I took out the passing of the DomainStats object from the RayTracer. The only way to fix this, would be to change the method used for coming up with a colour in RayDataNormal to calculating at the end, rather than every voxel site and then mixing in the colours and taking an average - kinda like a RayDataEnhanced. Since RayDataEnhanced runs quicker than RayDataNormal, I expect RayDataNormal will get removed, or replaced by a version of RayDataEnhanced without a lighting?
Is the cc neccessary?
I think it is, as otherwise the static consts and template specialisations would get redefined.

!ClusterRayTracer
This should take pointers, not addresses.
I realise that both references and pointers have duplicated functionaly and both have the same problems. In this case, references have been preferred because a) they're gauranteed not to be NULL,and b)I don't carry out any pointer arithmetic. In general I will use references where possible because there's no need to perform NULL checks.
Lots of long parameter names, these will look nicer if you can think of shorter versions (not always easy / possible).
I thought the same, but I can't think of shorter versions. The long ones are vectors, and describing the two ends of the vector is wordy.
Rather than "//(Remember that iRay.mDirection is normalised)", can we just make the function be called "iRay.!GetNormalisedDirection()". The direction itself is not normalised - just expressed by a normalised vector. In this function, the GetDirection function need not be normalised
GetInverseDirection relies on mDirection having been normalised, but is itself very much not normalised! The comment is more there for anyone looking at code who thinks "What's happening here? / how can you do this?". I could of course call the functions GetNormalisedDirectionVector() and GetInverseNormalisedDirectionVector() but this seems excessive.
!UpdateDataForNormalFluidSite should take a pointer to the site data
Same argument as before SiteData_t cannot be null. ClusterRayTracer knows it can't be null (though maybe an assert statement should be used just in case), as so is happy to dereferebce it
Some of these functions could be 'const' - that helps the compiler optimise. needed
None of the public functions can be declared const, so IMO there's no need to declare the private/protected functions const, even if they could legitimately be declared const.
If you used Vectors, and defined the values in your Direction enum, you could get rid of your switch statements (at least one, anyway).
Just realised I could get best of both by moving the enum to the Vector3D class.
There are a lot of static casts that could be simpler (int)-style casts that would give clearer code.
I prefer to use C++ style casts for consitancy (I realise I've left a lot of C style casts in).
Are all 3 MPI datatype implementations necessary?
I think so - I don't we can template template specialisations.

!ClusterShared
- !DoGetBlockIdFrom3DBlockLocation repeats an assumption that's in other places (like the !LatticeData), which should be combined somewhere.
- True, but the only similarities are that they're 3D block structures. If you want I could create a common (perhaps templatable) class for this, for both !ClusterShared and !LatticeData.
!ClusterTraverser
This should keep a pointer
Again, null case not valid and no pointer arithmetic
Would it be useful to have a method to return info about the current location in the cluster? Would that simplify some of the classes that use it?
These are inherited.

!ClusterNormal
It seems weird that this class even has the !DoGetWallData and its 'set' equivalent. This probably means that the encapsulation needs to be higher up the stack somewhere.
I can't remember whether it was you or Rupert who suggested this approach, but I think it was established that there doesn't seem to be a better way of doing this except #ifdef WALL_DATA

!ClusterBuilder
All the logging here is going to create a lot of info, on a low debug level. Does it need to be outputted at all? If so, only on log::Debug please. Rather than using asserts, you could output debug information if a constraint isn't met, and only on Debug log-level. That way, we'll get minimal, only helpful output.
- Forgot about that - now changed it to Debug. I can't change the asserts to log outputs because the assert command uses clever macroing to print out the file and line number. I could write a user defined assert as part of the logger which compiles them out when not running on the debug log level. Do you want me to do this?
Can this class use one of your traversers instead of the normal 3-level loop?
"insertion of copying" should probably be "insertion or copying", right?
I can't see where !GetSiteCoordinatesOfBlock is defined.

Comment by hywel on 26 Sep 2011 08:46 UTC Replying to nick:

General

While carrying out the changes, I accidently introduced a bug which an assert statement caught straight away. Since we have no unit test I have therefore chosen to keep them in until it can be gauranteed that I won't be changing the code or any code which uses code which currently has an assert statment - sorry but I don't have time to spend a whole day fixing a bug I could have spotted in seconds.

Maybe this is the problem then - using gdb, you can normally find and fix such bugs in the same amount of time. If you want, we can sit down together some time and talk about how to use gdb effectively, and I can show you how it works. Some people also recommend getting someone else to read through your code - Rupert found a lattice-Boltzmann bug (which I'd introduced) for me within about 2 minutes.

Amongst reasons to not use assert statements are that they don't diagnose or output what's wrong in the way logging does, they don't tell you anything about the root cause, they clutter the code with statements about what you didn't intend it to do (not necessarily what it's not able to do) and they kill the programme on every error (imagine a production scenario where we'd enable something like assertions to ensure our I/O was successful and that there were no injections in the HemeLB parameters and other stuff - the programme would also die if the image went too far blue which really isn't that big a problem).

The GeometryReader was having problems in production runs that we didn't get on smaller local runs, so now has code that is only run on a certain logging level that tests all of our assumptions about our reading of the input file, and displays a human-readable error message when appropriate. You could do something similar here if you don't want to face debugging in gdb.

The diff is prior to reformatting using eclipse, for clarity.

Then it's not ready for review!

Vector3D

is there a reason += doesn't return a Vector object?

I had assumed it didn't need one, but having looked it up, I can see why it should have one.

I think you should cast the multiplier to the type of T before you multiply.

Correct me if I'm wrong, but I think that if the Vector3D was teplated to int, and the multiplier was a float, the current code would do a floating point multiplication, then cost the result to int, which may be preferred.

Yup, I think you're right. Probably the most sensible to thing is to do no casting and allow the C++ type hierarchy to handle it.

Viewpoint.cc

Removed comment about previous code based on later comment

XYCoordinates.h

Couldn't this whole class be combined with Vector3D to make a more general template template Vector?

Probably, if you're willing to use a union shudders or auto generated code other than with templates

Really? Couldn't you just have T xs[instead ofT x, y, z;``` You'd have to get rid of the access with a dimension enumeration, instead using a dimension int.

Class kept in the vis namespace as I don't think anywhere else in hemelb can make use of a 2D class

These changes aren't about how they compile, they're about explaining to another programmer what your code does (cf. Literate Programming). This code is a useful little class that isn't specifically about visualisation, therefore it belongs in the util namespace.

HSLToRGBConverter

The method seems quite over-engineered, and there are few comments. It's not clear to me why we can't use floats, nor why we have to specify how many bytes each int is. The wiki page you link to makes the conversion look much simpler.

Floats can be used, but unsigned integers have been preferred for more even precision. Sizes have been specified since C++ gives no guarantees about the size of types.

All the maths carried out in the code matches that on the wiki page

No doubt, but if you choose to change it this much, you're going to need to add a lot more comments to explain what you're doing.

!RayData

Where is NO_VALUE_F defined? Does it have a sensible value?

constants.h - const float NO_VALUE_F = std::numeric_limits::max();

I've got rid of it now - the code that relies on it appears to have since been refactored by me.

!RayDataNormal

Can you put a BIG Todo comment somewhere saying that having !ColPixel templated on the ray type is a temporary measure until we split up the pixel across the different drawing classes.

Splitting the pixels types is a very good idea, but whichever class takes over the consolidations of all three pixels types into a single colour from ColPixel will have to be templated on the ray type. So really this isn't a temporary measure - just that ColPixel can be refactored into 4 classes, two of which will remain templated including the one which can legimitimately be called ColPixel. The only alternative is run-time polymorphism.

There doesn't '''have''' to be a class like that, as we're not tied into treating the different kinds of ray-tracer as mutually exclusive any more. Hence there are other alternatives between run-time polymorphism. Either way, this will definitely be a temporary measure.

* !RegisterMpiDataType creates a datatype every time it's called, which
is never freed = memory leak. I realise this probably wasn't your design,
but I'm afraid you've got to fix it! We'll have to do some kind of static
iniialisation and freeing of a static type, and return it on every call
to the function.
- I don't understand the leak? The function should only be called once. 
  Everything I create in the MPI_Datatype specialisations is on the stack 
  and the returned type gets saved to a static member of the templated 
  MpiDataTypeTraits class, which will get cleared up on termination. 
  If it's a side effect of one of the functions I call, some sort of 
  singleton will have to be implemented to provide clean-up.

Yup, we talked about this elsewhere, I think. For now, a TODO will do as it's such a small leak.

!RayDataEnhanced

Do we definitely want to do the "fabs(lDotProduct)"? Have you tried versions where the back wall of a vessel isn't displayed?

Looking down the tube is no longer possible with the diffTest, and the end of the tube doesn't show up. Obviously the rest of the tube looks fine, but assymetric blood vessels won't look as 3D. Is an extra if statement really going to be cheaper than fabs and a multiplication? I realise fabs isn't quite as efficient as changing the sign bit as it does some checks for special values, but I presume the branch prediction means that the hit is tiny.

Eh? Why is looking down the tube impossible? Re: if / fabs, it depends on ratio between how many times each is called. My point here is simply that your code treats towards- and away-facing surfaces the same and I wasn't sure if that was intentional.

I agree with your TODO - is it possible to do this now or is another bit of code in the way?

Done, the division is moved to the Get_Colour methods, this has the impact that GetAverage__ methods will change. Thus far, this isn't a problem, as those methods aren't used. But if you were later to use them, perhaps for things like being able to click on a pixel and getting a velocity, the division won't be done. To be honest, though, if we are going to report absolute velocities, much more will need to be looked at than this. I've had to do a horrible hack for RayDataNormal, as I took out the passing of the DomainStats object from the RayTracer. The only way to fix this, would be to change the method used for coming up with a colour in RayDataNormal to calculating at the end, rather than every voxel site and then mixing in the colours and taking an average - kinda like a RayDataEnhanced. Since RayDataEnhanced runs quicker than RayDataNormal, I expect RayDataNormal will get removed, or replaced by a version of RayDataEnhanced without a lighting?

I think we already have functionality for clicking on a pixel and getting its velocity. No idea if it's currently working. At the moment, I don't see us removing RayDataNormal until we're sure it's totally useless and can't even be modified to be useful.

Is the cc neccessary?

I think it is, as otherwise the static consts and template specialisations would get redefined.

!ClusterRayTracer

This should take pointers, not addresses.

I realise that both references and pointers have duplicated functionaly and both have the same problems. In this case, references have been preferred because a) they're gauranteed not to be NULL,and b)I don't carry out any pointer arithmetic. In general I will use references where possible because there's no need to perform NULL checks.

The normal argument is that using pointers is always explicit - you always know you are doing stuff to a pointer, but I suppose most of us are using decent IDEs now so we can easily see the type of the variable and it's not a problem.

Lots of long parameter names, these will look nicer if you can think of shorter versions (not always easy / possible).

I thought the same, but I can't think of shorter versions. The long ones are vectors, and describing the two ends of the vector is wordy.

lLowerSiteToFirstRayClusterIntersection -> lowSiteToFirstClusterCrossing or something?

Rather than "//(Remember that iRay.mDirection is normalised)", can we just make the function be called "iRay.!GetNormalisedDirection()". The direction itself is not normalised - just expressed by a normalised vector. In this function, the GetDirection function need not be normalised

GetInverseDirection relies on mDirection having been normalised, but is itself very much not normalised! The comment is more there for anyone looking at code who thinks "What's happening here? / how can you do this?". I could of course call the functions GetNormalisedDirectionVector() and GetInverseNormalisedDirectionVector() but this seems excessive.

!UpdateDataForNormalFluidSite should take a pointer to the site data

Same argument as before SiteData_t cannot be null. ClusterRayTracer knows it can't be null (though maybe an assert statement should be used just in case), as so is happy to dereferebce it

Some of these functions could be 'const' - that helps the compiler optimise. citation needed

None of the public functions can be declared const, so IMO there's no need to declare the private/protected functions const, even if they could legitimately be declared const.

It's a good hint to a compiler and it's a good hint to a programmer.

If you used Vectors, and defined the values in your Direction enum, you could get rid of your switch statements (at least one, anyway).

Just realised I could get best of both by moving the enum to the Vector3D class.

There are a lot of static casts that could be simpler (int)-style casts that would give clearer code.

I prefer to use C++ style casts for consitancy (I realise I've left a lot of C style casts in).

We use (type)-style casts in the rest of the code, which is the more important sort of consistency. They're also clearer.

Are all 3 MPI datatype implementations necessary?

I think so - I don't we can template template specialisations.

!ClusterShared

!DoGetBlockIdFrom3DBlockLocation repeats an assumption that's in other places (like the !LatticeData), which should be combined somewhere.

True, but the only similarities are that they're 3D block structures. If you want I could create a common (perhaps templatable) class for this, for both !ClusterShared and !LatticeData.

Whatever is similar about them should be combined. The concept that {x,y,z}->index has x co-ordinate as most significant and z co-ordinate as least should ideally only appear once in our whole code.

!ClusterTraverser

This should keep a pointer

Again, null case not valid and no pointer arithmetic

Would it be useful to have a method to return info about the current location in the cluster? Would that simplify some of the classes that use it?

These are inherited.

I can't see the code, but I meant a method like !GetWallData(int index) instead of doing !GetLocation(int index) and then calling code getting the wall data.

!ClusterNormal

It seems weird that this class even has the !DoGetWallData and its 'set' equivalent. This probably means that the encapsulation needs to be higher up the stack somewhere.

I can't remember whether it was you or Rupert who suggested this approach, but I think it was established that there doesn't seem to be a better way of doing this except #ifdef WALL_DATA

I can't see the code.

!ClusterBuilder

All the logging here is going to create a lot of info, on a low debug level. Does it need to be outputted at all? If so, only on log::Debug please. Rather than using asserts, you could output debug information if a constraint isn't met, and only on Debug log-level. That way, we'll get minimal, only helpful output.

Forgot about that - now changed it to Debug. I can't change the asserts to log outputs because the assert command uses clever macroing to print out the file and line number. I could write a user defined assert as part of the logger which compiles them out when not running on the debug log level. Do you want me to do this?

Logging data should mean you don't need the file and line number. If you can't infer the location from the output message you can grep for the message and see where the output comes from.

Can this class use one of your traversers instead of the normal 3-level loop?

"insertion of copying" should probably be "insertion or copying", right?

I can't see where !GetSiteCoordinatesOfBlock is defined.

You've replaced the original diff with an interdiff. Next time, can you replace with a fresh diff please (or at least leave the original diff)? Ideally of all the non-merge changes since your latest common ancestor with entropy. Ta.

Comment by nick on 28 Sep 2011 13:03 UTC Replying to hywel:

Maybe this is the problem then - using gdb, you can normally find and fix such bugs in the >same amount of time. If you want, we can sit down together some time and talk about how to use > gdb effectively, and I can show you how it works. Some people also recommend getting someone else to read through your code - Rupert found a lattice-Boltzmann bug (which I'd introduced) for me within about 2 minutes.

Amongst reasons to not use assert statements are that they don't diagnose or output what's wrong in the way logging does, they don't tell you anything about the root cause, they clutter the code with statements about what you didn't intend it to do (not necessarily what it's not able to do) and they kill the programme on every error (imagine a production scenario where we'd enable something like assertions to ensure our I/O was successful and that there were no injections in the HemeLB parameters and other stuff - the programme would also die if the image went too far blue which really isn't that big a problem).

I'd be very keen to learn more about GDB but since I'm leaving on Friday, it might not be a good use of your time. I would be surprised if you can find bugs before you even began looking for them using GDB which assertions will do. I would agree that they don't tell you the root cause. I'm afraid I don't understand how logging will be any better. The assertions state the programmer's assumptions ie, I assume the inputs are positive, or I assume that the function will return a positive value, and are even better than comments because they're guaranteed to be true or the programmer will be informed otherwise. Killing the application is desirable - in your case, if the IO fails and there's no code to handle the failure of the IO, anything the application outputs is potentially invalid at this point.

Nevertheless, since I'm leaving Friday and arguing is futile, I'll remove all the assertions.

The GeometryReader was having problems in production runs that we didn't get on smaller local runs, so now has code that is only run on a certain logging level that tests all of our assumptions about our reading of the input file, and displays a human-readable error message when appropriate. You could do something similar here if you don't want to face debugging in gdb.

The diff is prior to reformatting using eclipse, for clarity.

Then it's not ready for review!

What I meant is that the code has been reformatted by eclipse, but this would muddy the diff massively.

XYCoordinates.h

Couldn't this whole class be combined with Vector3D to make a more general template template Vector?

Probably, if you're willing to use a union shudders or auto generated code other than with templates

Really? Couldn't you just have T xs[instead ofT x, y, z;``` You'd have to get rid of the access with a dimension enumeration, instead using a dimension int. I see - switch to an array but then use a function accepting an enum to maintain readability in the cases where the dimension is referred to by letter? I can implement this if you want.

These changes aren't about how they compile, they're about explaining to another programmer what your code does (cf. Literate Programming). This code is a useful little class that isn't specifically about visualisation, therefore it belongs in the util namespace. Fair point. If you want to go with above, the class will soon vanish. Otherwise I'll do this.

HSLToRGBConverter

The method seems quite over-engineered, and there are few comments. It's not clear to me why we can't use floats, nor why we have to specify how many bytes each int is. The wiki page you link to makes the conversion look much simpler.

Floats can be used, but unsigned integers have been preferred for more even precision. Sizes have been specified since C++ gives no guarantees about the size of types.

All the maths carried out in the code matches that on the wiki page

No doubt, but if you choose to change it this much, you're going to need to add a lot more comments to explain what you're doing. I'll get onto this.

!RayDataEnhanced

Do we definitely want to do the "fabs(lDotProduct)"? Have you tried versions where the back wall of a vessel isn't displayed?

Looking down the tube is no longer possible with the diffTest, and the end of the tube doesn't show up. Obviously the rest of the tube looks fine, but assymetric blood vessels won't look as 3D. Is an extra if statement really going to be cheaper than fabs and a multiplication? I realise fabs isn't quite as efficient as changing the sign bit as it does some checks for special values, but I presume the branch prediction means that the hit is tiny.

Eh? Why is looking down the tube impossible? Re: if / fabs, it depends on ratio between how many times each is called. My point here is simply that your code treats towards- and away-facing surfaces the same and I wasn't sure if that was intentional.

When you "look down the tube", you only see back walls. Therefore everything becomes maximum lightness. Ok - you're still looking at a projection of data within the tube, but it's impossible to perceive that this is the case.

I think we already have functionality for clicking on a pixel and getting its velocity. No idea if it's currently working. At the moment, I don't see us removing RayDataNormal until we're sure it's totally useless and can't even be modified to be useful. There is code for density and stress - don't think it's implemented in VizClient.

The normal argument is that using pointers is always explicit - you always know you are doing stuff to a pointer, but I suppose most of us are using decent IDEs now so we can easily see the type of the variable and it's not a problem. I don't think I understand. Do you mean that using a pointer means you're constantly reminded by the -> that the object may not actually exist or exists outside the scope of the function (unless someone takes an address of an object local to the function)?

Lots of long parameter names, these will look nicer if you can think of shorter versions (not always easy / possible).

I thought the same, but I can't think of shorter versions. The long ones are vectors, and describing the two ends of the vector is wordy.

lLowerSiteToFirstRayClusterIntersection -> lowSiteToFirstClusterCrossing or something? I'd be worried that it's not clear what a "ClusterCrossing" is or what a "lowSite" is. Of course lower site is already confusing - could be renamed to SiteOfLowestXYAndZCoordinates but at least lower site is used elsewhere in the raytracer.

Some of these functions could be 'const' - that helps the compiler optimise. citation needed

None of the public functions can be declared const, so IMO there's no need to declare the private/protected functions const, even if they could legitimately be declared const.

It's a good hint to a compiler and it's a good hint to a programmer. Fortunately, when I declared a function const which altered a member variable, the compiler knew better than to believe my hint. Isn't this "feature" of C++ because const isn't a separate part of the language but part of the static type system and there are some cases where the compiler can't tell if a function really is "const", just certain cases it clearly isn't "const"?

Nevertheless, if you really want, I can go through and add const to the relevant private functions.

We use (type)-style casts in the rest of the code, which is the more important sort of consistency. They're also clearer. I'm not sure what you mean by clearer, and I'm surprised you use the backwards compatibility of C++ over it's native casts, but I'll replace them for consistency.

!ClusterShared

!DoGetBlockIdFrom3DBlockLocation repeats an assumption that's in other places (like the !LatticeData), which should be combined somewhere.

True, but the only similarities are that they're 3D block structures. If you want I could create a common (perhaps templatable) class for this, for both !ClusterShared and !LatticeData.

Whatever is similar about them should be combined. The concept that {x,y,z}->index has x co-ordinate as most significant and z co-ordinate as least should ideally only appear once in our whole code.

Do you want me to create a new templated type containing a T*\ with some nice methods and using this in all cases? If so, what do you want it called, wand where do you want it placed?

!ClusterTraverser

This should keep a pointer

Again, null case not valid and no pointer arithmetic

Would it be useful to have a method to return info about the current location in the cluster? Would that simplify some of the classes that use it?

These are inherited.

I can't see the code, but I meant a method like !GetWallData(int index) instead of doing !GetLocation(int index) and then calling code getting the wall data. I think I understand. The ClusterTraverser works on a block level. The ClusterTraverser could possibly return a inherited version of SiteTraverser which could be used in this manner. I think this would be less simple than simply obtaining the block index and using this as an argument to TraverseVoxel.

You've replaced the original diff with an interdiff. Next time, can you replace with a fresh diff please (or at least leave the original diff)? Ideally of all the non-merge changes since your latest common ancestor with entropy. Ta. I never uploaded the original diff as it was polluted by all the merges. I assumed that last time you did the code review, you simply looked at the files in my repository?

Comment by hywel on 29 Sep 2011 11:40 UTC Replying to nick:

Replying to hywel:

Maybe this is the problem then - using gdb, you can normally find and fix such bugs in the >same amount of time. If you want, we can sit down together some time and talk about how to use > gdb effectively, and I can show you how it works. Some people also recommend getting someone else to read through your code - Rupert found a lattice-Boltzmann bug (which I'd introduced) for me within about 2 minutes.

Amongst reasons to not use assert statements are that they don't diagnose or output what's wrong in the way logging does, they don't tell you anything about the root cause, they clutter the code with statements about what you didn't intend it to do (not necessarily what it's not able to do) and they kill the programme on every error (imagine a production scenario where we'd enable something like assertions to ensure our I/O was successful and that there were no injections in the HemeLB parameters and other stuff - the programme would also die if the image went too far blue which really isn't that big a problem).

I'd be very keen to learn more about GDB but since I'm leaving on Friday, it might not be a good use of your time. I would be surprised if you can find bugs before you even began looking for them using GDB which assertions will do. I would agree that they don't tell you the root cause. I'm afraid I don't understand how logging will be any better. The assertions state the programmer's assumptions ie, I assume the inputs are positive, or I assume that the function will return a positive value, and are even better than comments because they're guaranteed to be true or the programmer will be informed otherwise. Killing the application is desirable - in your case, if the IO fails and there's no code to handle the failure of the IO, anything the application outputs is potentially invalid at this point.

You're probably right that it's too late to really get into GDB. I suppose my point here is that some programmers' assumptions are inviolable (e.g. the IO case I mentioned) and some aren't (e.g. hue going above 360.0) and assertions don't let you distinguish between cases. Using the logger stuff we have lets you specify the output a user gets and gives you the choice of killing the programme or not.

Nevertheless, since I'm leaving Friday and arguing is futile, I'll remove all the assertions.

Arguing certainly isn't futile; there's not been a firm decision made about assertions yet. I remain unconvinced and favour logging, Miguel is in favour of assertions and we're going to meet and make a decision sometime soon. If you don't want to replace your assertions with logging at the right level then it's probably better to leave them than remove them altogether.

The GeometryReader was having problems in production runs that we didn't get on smaller local runs, so now has code that is only run on a certain logging level that tests all of our assumptions about our reading of the input file, and displays a human-readable error message when appropriate. You could do something similar here if you don't want to face debugging in gdb.

The diff is prior to reformatting using eclipse, for clarity.

Then it's not ready for review!

What I meant is that the code has been reformatted by eclipse, but this would muddy the diff massively.

Ah OK. If you run diff with -bwB it should ignore most white-space changes.

XYCoordinates.h

Couldn't this whole class be combined with Vector3D to make a more general template template Vector?

Probably, if you're willing to use a union shudders or auto generated code other than with templates

Really? Couldn't you just have T xs[instead ofT x, y, z;``` You'd have to get rid of the access with a dimension enumeration, instead using a dimension int. I see - switch to an array but then use a function accepting an enum to maintain readability in the cases where the dimension is referred to by letter? I can implement this if you want.

You can use an enum or just use an index integer.

These changes aren't about how they compile, they're about explaining to another programmer what your code does (cf. Literate Programming). This code is a useful little class that isn't specifically about visualisation, therefore it belongs in the util namespace. Fair point. If you want to go with above, the class will soon vanish. Otherwise I'll do this.

I'd prefer the above one.

HSLToRGBConverter

The method seems quite over-engineered, and there are few comments. It's not clear to me why we can't use floats, nor why we have to specify how many bytes each int is. The wiki page you link to makes the conversion look much simpler.

Floats can be used, but unsigned integers have been preferred for more even precision. Sizes have been specified since C++ gives no guarantees about the size of types.

All the maths carried out in the code matches that on the wiki page

No doubt, but if you choose to change it this much, you're going to need to add a lot more comments to explain what you're doing. I'll get onto this.

!RayDataEnhanced

Do we definitely want to do the "fabs(lDotProduct)"? Have you tried versions where the back wall of a vessel isn't displayed?

Looking down the tube is no longer possible with the diffTest, and the end of the tube doesn't show up. Obviously the rest of the tube looks fine, but assymetric blood vessels won't look as 3D. Is an extra if statement really going to be cheaper than fabs and a multiplication? I realise fabs isn't quite as efficient as changing the sign bit as it does some checks for special values, but I presume the branch prediction means that the hit is tiny.

Eh? Why is looking down the tube impossible? Re: if / fabs, it depends on ratio between how many times each is called. My point here is simply that your code treats towards- and away-facing surfaces the same and I wasn't sure if that was intentional.

When you "look down the tube", you only see back walls. Therefore everything becomes maximum lightness. Ok - you're still looking at a projection of data within the tube, but it's impossible to perceive that this is the case.

I think we already have functionality for clicking on a pixel and getting its velocity. No idea if it's currently working. At the moment, I don't see us removing RayDataNormal until we're sure it's totally useless and can't even be modified to be useful. There is code for density and stress - don't think it's implemented in VizClient.

The normal argument is that using pointers is always explicit - you always know you are doing stuff to a pointer, but I suppose most of us are using decent IDEs now so we can easily see the type of the variable and it's not a problem. I don't think I understand. Do you mean that using a pointer means you're constantly reminded by the -> that the object may not actually exist or exists outside the scope of the function (unless someone takes an address of an object local to the function)?

It's more that using -> makes it obvious to the programmer that they are dealing with access to a non-locally-owned variable and changes could affect other bits of code. Using . for access to what is effectively a pointer doesn't cue the programmer in the same way (unless they're using an editor that tells them the variable is a reference rather than a local / member). But I think the pluses of references (non-NULL, no reassignment) probably outweigh the negatives.

Lots of long parameter names, these will look nicer if you can think of shorter versions (not always easy / possible).

I thought the same, but I can't think of shorter versions. The long ones are vectors, and describing the two ends of the vector is wordy.

lLowerSiteToFirstRayClusterIntersection -> lowSiteToFirstClusterCrossing or something? I'd be worried that it's not clear what a "ClusterCrossing" is or what a "lowSite" is. Of course lower site is already confusing - could be renamed to SiteOfLowestXYAndZCoordinates but at least lower site is used elsewhere in the raytracer.

The name doesn't need to say everything, it really needs to be concise and sufficiently accurate. If you need to explain in greater detail what your storing, use a comment where you declare the var.

Some of these functions could be 'const' - that helps the compiler optimise. citation needed

None of the public functions can be declared const, so IMO there's no need to declare the private/protected functions const, even if they could legitimately be declared const.

It's a good hint to a compiler and it's a good hint to a programmer. Fortunately, when I declared a function const which altered a member variable, the compiler knew better than to believe my hint. Isn't this "feature" of C++ because const isn't a separate part of the language but part of the static type system and there are some cases where the compiler can't tell if a function really is "const", just certain cases it clearly isn't "const"?

Nevertheless, if you really want, I can go through and add const to the relevant private functions.

Const tells me something important about the function without having to look at the implementation. The compiler knows when it reads the .cc and you've lied about the const-ness, I believe it also can effect optimisations being made or not when using functions with pointers (const on one of the pointers will imply that the pointers you've passed don't overlap or something). This probably doesn't affect your case, but the readability point remains (which for a long-maintained code is nearly as important). See http://www.parashift.com/c++-faq-lite/const-correctness.html#faq-18.3 if unconvinced.

We use (type)-style casts in the rest of the code, which is the more important sort of consistency. They're also clearer. I'm not sure what you mean by clearer, and I'm surprised you use the backwards compatibility of C++ over it's native casts, but I'll replace them for consistency.

I don't think there's any difference in the generated assembly but concise code is always more readable (especially if it means you can fit one statement on one line).

!ClusterShared

!DoGetBlockIdFrom3DBlockLocation repeats an assumption that's in other places (like the !LatticeData), which should be combined somewhere.

True, but the only similarities are that they're 3D block structures. If you want I could create a common (perhaps templatable) class for this, for both !ClusterShared and !LatticeData.

Whatever is similar about them should be combined. The concept that {x,y,z}->index has x co-ordinate as most significant and z co-ordinate as least should ideally only appear once in our whole code.

Do you want me to create a new templated type containing a T*\ with some nice methods and using this in all cases? If so, what do you want it called, wand where do you want it placed?

That sounds like a good solution. It sounds like it belongs in utils, maybe called template<typename LatticeType> Lattice3D; .

!ClusterTraverser

This should keep a pointer

Again, null case not valid and no pointer arithmetic

Would it be useful to have a method to return info about the current location in the cluster? Would that simplify some of the classes that use it?

These are inherited.

I can't see the code, but I meant a method like !GetWallData(int index) instead of doing !GetLocation(int index) and then calling code getting the wall data. I think I understand. The ClusterTraverser works on a block level. The ClusterTraverser could possibly return a inherited version of SiteTraverser which could be used in this manner. I think this would be less simple than simply obtaining the block index and using this as an argument to TraverseVoxel.

If you think it'd be less simple, no worries.

You've replaced the original diff with an interdiff. Next time, can you replace with a fresh diff please (or at least leave the original diff)? Ideally of all the non-merge changes since your latest common ancestor with entropy. Ta. I never uploaded the original diff as it was polluted by all the merges. I assumed that last time you did the code review, you simply looked at the files in my repository?

Yup, you're totally right. Sorry about that.

Modified by hywel on 13 Oct 2011 11:40 UTC

Modified by jamespjh on 3 Nov 2011 14:44 UTC

Comment by hywel on 10 Nov 2011 08:50 UTC I merged this, fixed some stuff that didn't work after the merged and checked all tests pass. I'm treating the merge as a code review (in addition to the one on this ticket). Pushed as changeset 918a0686e7bd.

hemelb-codes / hemelb

Nick's work. #7