atcollab / at

Accelerator Toolbox
Apache License 2.0
48 stars 31 forks source link

get_unit32_index , get_bool_index names? #605

Closed swhite2401 closed 1 year ago

swhite2401 commented 1 year ago

Dear all,

Recently get_unit32_index , get_bool_index were added to replace the old functions get_cells and get_refpts (that are now in the deprecated module).

Following a discussion we had yesterday with @ThorstenHellert, very good arguments were raised to support the use of refpts as AT naming convention for element indexes. This is an historical AT feature and it was maybe not such a good idea to change it to index.

I am partially guilty because I approved this PR which at the time seemed like a good idea. However, I feel like others may have missed this a could feel otherwise.

This is a recent addition so it is still ok to change but if needed we have to do it now, please advise.

MJGaughran commented 1 year ago

Do you have any record of this conversation, or a summary of the arguments? I can't find any Issues or Discussions about this.

And who else was part of this discussion?

swhite2401 commented 1 year ago

@MJGaughran this conversation took place during a collaboration meeting with DESY/LBNL colleagues so it does not appear here. I am just bringing this up to you as a users feedback

MJGaughran commented 1 year ago

@swhite2401 A reasonable number of people saw the previous PR, and approved it, because they thought it was a worthwhile change. It isn't clear what the problems are as we have no record of your conversation or of their arguments. Even a brief summary from your own memory would help, although an email from the user would be the most useful.

My own opinions are probably not too important, but I want to get this information so the others you have linked are able to properly engage with this issue.

oscarxblanco commented 1 year ago

Dear all, the two names seem valid (I believe it is get_uint ... and not get_unit..., seems to be just a typo).

get_uint32_index looks clearer when talking about variables in an array programing language (among which numpy in python and matlab), quite evident the max. indexing.

get_refpts looks closer to the accelerator diagnostics where you could only get info about the beam in particular locations, and in simulations at most at every element. To get something inbetween elements you need to do some sort of split or extrapolation.

Depending on the context one could prefer one or the other. However, is there any reason to explicitely include the memory allocation type on the function name ? I have come across with the Nslices attribute of insertion devices that explicitely needs to be int (or numpy.uint8). I guess this was the reason why get_uint32... was chosen.

o

swhite2401 commented 1 year ago

Hi @oscarxblanco,

I agree the 2 names are valid and *_index would be the natural choice for the programmer. However, initially a different choice, namely *_refpts, was made in AT. I was therefore wondering whether this change of definition may be a source of confusion for historical AT users that are used to *_refpts (as it seemed to be the case in this meeting I mentioned above).

If this is not an issue, very good we can leave everything as is!

But as pointed out by @MJGaughran this has already been approved so I am really not pushing at all for this unless there are clear concerns.

lfarv commented 1 year ago

Maybe I can try to make things more clear, since it looks there is some confusion between "refpts" and "index"

A "refpts" variable is an element "selector", it's not always an "index". It can be

On the other hand the following functions generate indexes, which can be used as "refpts", but also for other purposes:

These names were introduced because it was not obvious that get_cells would return booleans and get_refpts integers. These name were just inherited from Matlab. To answer @oscarxblanco, the data type is interesting to know because while you can use both as "refpts", the operations you can perform are different: logical "or" for booleans, concatenation for integers.

For me, the outcome of the discussion mentioned above was:

I am still in favour of this, because meaningful names make understanding code much easier. Fans of get_refpts may still use it (though if there is choice, boolean indexing is more efficient, Matlab says!). For users moving from Matlab to python, it's up to them: keeping their old habits, or change.

swhite2401 commented 1 year ago

Thanks @lfarv, very clear.

For me: get_uint32_index, get_bool_index or get_uint32_refpts, get_bool_refpts would be equally correct and self descriptive for AT users. The latter being more coherent with naming conventions used in the past (get_refpts).

simoneliuzzo commented 1 year ago

It makes no real difference for me. I always end up using get_refpts! *

How could we add in the AT documentation the meanings of index, refpts or mask to make it clear for users? May be the same that was done for Collective effects by @lcarver ?

swhite2401 commented 1 year ago

@simoneliuzzo get_uint32_index is the same as get_refpts (returns an array of indexes) and get_bool_index is the same as get_cells (returns a mask). In fact the old deprecated functions are calling the new ones if you look at the source.

Documentation on the definition of refpts would be great indeed! Maybe extent to general lattice operations (indexing, manipulations, loading, saving, etc....)

lfarv commented 1 year ago

@simoneliuzzo:

How could we add in the AT documentation the meanings of index, refpts or mask to make it clear for users

There is already some explanations here, it could be improved by adding examples. The documentation of each function having a "refpts" argument should have a link to that page (only partially done at the moment).

I end up reverting to get_refpts or get_cells

if you want to use integers, the "non-deprecated" function is get_uint32_index. But keep in mind that you don't always need to go through this: all optical functions (ex. linopt6) accept any kind of refpts mentioned here (element classes, patterns, End…). So getting an index (integer or boolean) is useful if:

lfarv commented 1 year ago

@swhite2401:

get_uint32_index or get_uint32_refpts

Ah, you did not mention get_uint32_refpts yet! I thought the idea was to remove the new names and stick to the historical ones… Well, it's anyway a new name be introduced, so I would keep "index" as more explicit": an "index" may be used as a "refpts", among other things… Are there strong arguments for "refpts", apart from the fact that it looks more like the old name? And why not "cells"? But if there is a unanimous push for get_uint32_refpts

swhite2401 commented 1 year ago

@lfarv , sorry for the confusion but this was the whole point of this issue... I obviously did not explain myself correctly!

I would also keep index. It is in fact more explicit and new users will understand immediately what it means...but since the question was raised I brought it up here just to be sure this change has no gone unnoticed

Looks like there is no issue from the comments above so perfect!