davidrmiller / neural2d

Neural net optimized for 2D image data
MIT License
142 stars 69 forks source link

Refactor(xySize, dxySize): Refactor xySize and dxySize. #22

Closed jeandudey closed 8 years ago

jeandudey commented 9 years ago

I reemplazed xySize and dxySize with these new structs:

Both structs use C++ meta-programming features, the Vector3 struct inherits from Vector2.

I refactored too the ImageReader public function "getData()" to a less typed name "data" and changed the parameters from camelCase to camel_case.

In the next commits and pull request i will be doing a series of refactorings, the next will probably be the unitTest module, in these commits i will be using this commit template and using the Google C++ Style guide (which i strongly suggest to use).

Jean Pierre Dudey <jeandudey _AT_ hotmail _DOT_ com>

davidrmiller commented 9 years ago

Greetings @jeandudey, and thanks for this contribution. I'll investigate and get back to you with some questions about these changes.

davidrmiller commented 9 years ago

dataContainer

I'm ok with renaming dataContainer to data_container in class ImageReader. Either way works for me.

colorChannel

That's a good idea to rename colorChannel to channel, because it's possible to use those channels for any kind of information that is not strictly color data.

getData()

Can you describe the reason for changing the member function name in class ImageReader from getData() to just data()? Personally, I like it when functions are named as verbs to indicate what the function does. The name "getData()" tells me that it will "get data." It could also be named extractData() or some other verb. However, the name data() is just a noun and does not indicate what the function does with data. To me, it seems like this change makes it harder to read the code.

dxySize

For the xySize and dxySize types, I would like to present an argument for keeping their existing names. These types have a very specific and narrow meaning -- they are specifically for representing neuron locations in a single layer of the neural net topology. They are never used in any more general manner.

A single layer consists of one or more planes of neurons, so a neuron location needs three coordinates to specify which plane and the X,Y location in the plane. The planes-dimension is not like the X and Y dimensions -- it corresponds to the number of convolution kernels being trained. It's like in physics where locations in spacetime have coordinates (t,x,y,z) and t (time) is a special dimension not quite like the others.

I have been calling that third dimension "depth," but it could also be called "plane" or something like that. But I'm not comfortable calling that dimension "z" because that dimension is not just another dimension like x and y. That's related to why it's called neural2d instead of neural3d :-)

(Class xySize is not strictly necessary; we could have used dxySize everywhere xySize is used. It's a separate type so that its name can convey the meaning that it's a neuron location within a given plane.)

Personally, I like type names to be self-documenting when possible. When I see the type name "dxySize," it reminds me that it is specifically the location of a neuron in terms of depth and x, y. But if I see the type name "Utils::Vector2u32," it doesn't speak to me. It makes it look like a general purpose vector, which it isn't. I like to generalize abstractions when possible, but these types are not meant to be used for anything else.

However, I am not opposed to changing their names, as long as their names continue to document that they are used only for holding an X,Y location and a depth. For example, depthXY_t, or DXYlocation, etc.

Also, I am not opposed to defining dxySize and xySize as templated structs where one inherits from the other if that solves some particular problem (does it?). If there is no particular problem that it solves, then I suggest we leave those structs as simple structs. If there is a problem that needs to be solved by using templates and inheritance, then I suggest that we declare those types that way in neural2d.h and not add another header file just for those types.

Google style guide

The Google style guide is a perfectly good style guide if you work for Google and everybody follows the guide. But neural2d is not a Google project, and I don't see a need to change neural2d to conform to Google's guide. If we did, we would have to change every line in the entire project and stop using some C++ features that are perfectly legal and portable to use. Neural2d is for educational purposes, and I think it is educational to use C++ features in the way they were intended to be used. Many of my objections to Google's style guide are the same as the ones documented in this article.

Your comments and insights are welcomed.

jeandudey commented 9 years ago

getData() I rename getData() to simply data() because only You get "data" of the "ImageReader" class and you can not put data in the class "ImageReader" to better understand what I explain, imagine you have a struct called "Image" and all data resides in a field of type "T" called data.

This struct may look something like this:

struct Image {
     T data;
}

And then you would get data like this:

Image img;
use_the_data(img.data);

But if you're planning in the future it would be more appropiate to use get and set methods.

dxySize I'm ok with your explanation, i changed the name to a more general name "Vector3", but I did not know which it was used to store beyond the dimensions of the images, my bad :(

About the inheritance, Vector3 inherits Vector2 because it's logic that "Vector3" has a parent ("Vector2") and it has a children ("Vector4"), also it simplifies the code by calling the Vector2 constructor that sets all fields to 0 and Vector3 only sets z to 0.

It could be also removed the meta-programming and use uint32 by default. About naming Vector2 could be typedefed to ImageSize or simply Size and Vector3 to DXYLocation or anything else and change z to depth or plane.

Also i used template meta-programming if in the future you planned to use more precise types like float and double.

Google Style Guide The only thing that i like of it is the syntax used for C++ that is quite simpler and cleaner.

Thanks for the feedback.

BTW: Sorry for my bad english, i don't speak it natively.

davidrmiller commented 9 years ago

No need to apologize, your English is 10000 times better than my Spanish :-)

getData()

Thanks for those examples. I understand the following example:

Image img;
use_the_data(img.data);

However, I don't think that example applies to class ImageReader. Class ImageReader only contains a single function that moves image information from one external container (the image file) to another external container (a vector). There is no data inside class ImageReader, so that class will never have accessors.

Perhaps the function name is confusing because "getData()" sounds like an accessor, but it isn't -- its purpose is to parse and move image data between external objects, and report the image dimensions.

Would it be an improvement to rename getData() to something like readData(), extractImageData(), moveImageData(), parseExtractAndSaveData(), or something like that?

jeandudey commented 9 years ago

You're right, getData sounds confusing in the contex that it is used, it would be appropiate to change the name of the method. Another name that it can have for less typing is: read_file()

davidrmiller commented 9 years ago

getData()

I like that suggestion -- read_file() is a better name for that function than getData().

Code formatting

On the matter of code formatting, there is no style that will please everybody, so the best we can hope for is to adopt a style and use it consistently.

So far, the neural2d source code uses a K&R-like brace style, with four-character indentation using spaces, not tabs. That general style will be familiar to most new programmers because it is similar to the style used in the publications by K&R and Stroustroup (the creators of C and C++), and in the C and C++ standards documents produced by ANSI and ISO/IEC.

For a small project like neural2d, I personally don't see a need to be overly strict about formatting, so I'm happy with some variation in the formatting as long as it looks more or less consistent with what you would find in a K&R or Stroustrup book or in the C or C++ standard documents.

davidrmiller commented 8 years ago

Just cleaning up loose ends. I'll close this particular pull request, but feel free to resubmit a new one. There is useful discussion in this thread; feel free to continue the discussion here or in a new Issue or pull request. Many thanks for the participation and discussion.