Differences in cpymad `table.twiss` and madx `table(twiss, X, X)` when running `use`

JoschD commented 3 years ago

I see differences when getting values from table.twiss.dframe().loc[X, X] vs running table(twiss, X, X) in madx (e.g. via input()) if there is a use in between the last twiss and getting the values.

These differences can be quite substantial, but I have not managed to create a minimum working example with large differences. Here is one that shows the problem, but the differences are not as big as I can see in some of my scripts:

#%%
# ------- Setup --------
from cpymad.madx import Madx
madx =  Madx()
mvars = madx.globals  # shorthand

madx.call("/afs/cern.ch/eng/lhc/optics/runII/2018/toolkit/macro.madx")
madx.call("/afs/cern.ch/eng/lhc/optics/runII/2018/lhc_as-built.seq")
madx.call("/afs/cern.ch/eng/lhc/optics/runII/2018/PROTON/opticsfile.22_ctpps2")
mvars.on_sep1 = 0.  # creates more obvious differece

madx.beam(
    sequence='lhcb1', bv=1,
    energy="NRJ", particle="proton", npart=1e10,
    kbunch=0, ex=7.29767146889e-09, ey=7.29767146889e-09
)
madx.use(sequence='lhcb1')

#%%
# ------ Actual Test ------
madx.twiss(sequence="lhcb1")
madx.use(sequence="lhcb1")  # This seems to be the culprit
# madx.twiss(sequence="lhcb1")  # Also fixes the problem

twiss_df = madx.table.twiss.dframe()
mvars["from_dframe"] = twiss_df.loc["ip1", "x"]
madx.input("from_table = table(twiss, ip1, x);")

print(mvars["from_dframe"])
print(mvars["from_table"])

Output:

-9.04026208843632e-11
-3.822658682068302e-11

upon commenting out the madx.use() or doing another madx.twiss() afterwards :

-3.822658682068302e-11
-3.822658682068302e-11

ydutheil commented 3 years ago

Hi,

I remember facing something similar and since then I never use the madx.table.twiss because I am unsure what remains there. I think the point is that USE creates the sequence anew, and seems for some reason to change the row_names of the table.

What do you think using twiss_df = madx.sequence['lhcb1'].twiss_table.dframe() instead, this is what I usually do and in your example it raises an exception on the twiss table being invalid.

below a not so minimal example

from cpymad.madx import Madx

mqk, mql, length = 1.0, 0.1, 5.3

madx = Madx()
madx.beam(particle='proton', pc=1000)

madx.command.sequence.clone('S1', l='{:10.6e}'.format(length))
madx.elements.marker.clone('s1start', at=0)
madx.command.endsequence()

madx.command.quadrupole.clone('QF', l=mql, k1=mqk)
madx.command.quadrupole.clone('QD', l=mql, k1=-mqk)
madx.command.rbend.clone('dip', l=0.1, angle=10e-3)

madx.command.seqedit(sequence='S1')
madx.command.install(element='QF1', class_='QF', at='{:10.6e}'.format(mql/2))
madx.command.install(element='B1', class_='dip', at=mql+0.1/2)
madx.command.install(element='QD1', class_='QD', at='{:10.6e}'.format(length/2-mql/2))
madx.command.endedit()

madx.use(sequence='s1')
madx.twiss()

print(madx.table.twiss.row_names())
print(madx.table.twiss.name)
print(madx.sequence['s1'].twiss_table.row_names())

madx.options(debug=True)
madx.use(sequence='s1')
print(madx.table.twiss.row_names())
print(madx.table.twiss.name)
print(madx.sequence['s1'].twiss_table.row_names())

JoschD commented 3 years ago

Hey, thank you for your answer. That looks like a safer way of doing things indeed!

But, workarounds aside, this still looks like a bug as at the same state of the madx-instance table(twiss, X, X) works while none of the other two options (table.twiss or sequence['lhcb1'].twiss_table) result in correct values (at least the second one let's you know about it, though...)

coldfix commented 3 years ago

Hi,

interesting. I've reproduced using a smaller example (since I don't have or want to install afs here).

from cpymad.madx import Madx

madx = Madx()
madx.input("""
beam;

mqf.k1 =  0.3037241107;
mqd.k1 = -0.3037241107;

fodo: sequence, l=10, refer=entry;
mqf: quadrupole, at=0, l=1, k1:=mqf.k1;
dff: drift,      at=1, l=4;
mqd: quadrupole, at=5, l=1, k1:=mqd.k1;
dfd: drift,      at=6, l=4;
endsequence;

use, sequence=fodo;
twiss, sequence=fodo, x=0.1;
""")

before = madx.table.twiss.dframe()
madx.use(sequence="fodo")
after = madx.table.twiss.dframe()

print("row names before:", before.index.tolist())
print("row names after: ", after.index.tolist())

print("name column before:", before.name.tolist())
print("name column after: ", after.name.tolist())

print("x column before:", before.x.tolist())
print("x column after: ", after.x.tolist())

madx.input("from_table = table(twiss, mqf, x);")
print(madx.globals['from_table'])

From this we can see that the table values are still correct, just the row names get messed up by the use call. There is also a comment in the MAD-X use_sequ function in src/mad_seq.c: calling use screws up any twiss table calculated..

I believe the reason for that is that the table row names are assigned as raw pointers to the current node's name when adding a row, see src/mad_table.c:

void
augment_count(const char* table) /* increase table occ. by 1, fill missing */
{
  ...
  if (t->node_nm != NULL)
  {
    t->node_nm->p[t->curr] = current_node->name;
    t->node_nm->curr = t->curr;
  }
  ...
}

Now, after calling use, the sequence's nodes may be invalidated, and hence these pointers are not valid anymore. We could maybe even get segmentation faults from trying to read these values. Hence, using node_nm is not safe after use.

The reason that MAD-X's table(twiss, mqf, x) still works is that it doesn't use node_nm to look up the row, but seems to try all string columns instead, see function table_row(table, name) in src/mad_table.c.

The approach MAD-X is taking (i.e. trying to find the element name in arbitrary string columns) is not a good idea IMO, and not even applicable here (since we have to decide on one column to be used for the dataframe's index).

Hence, this is probably something to be added to "known issues" that it is unsafe to use tables after use.

A possible workaround would be:

twiss_table = madx.table.twiss
twiss_df = pd.DataFrame(
    twiss_table.copy(),
    index=np.char.partition(twiss_table.name, ":")[:, 0])

We could integrate using this workaround by adding an index argument to the Table.dframe() method that allows specifying a column name to be used as index instead of node_nm. What do you think?

JoschD commented 3 years ago

Thank you very much for looking that deep into it (and for providing your take on the smallest example), @coldfix . I like your workaround idea, if the data is still in place? It didn't sound too promising, if the nodes are rebuild. Can we trust any data in the table.twiss after use? I also forwarded this issue to @tpersson , in case he sees some obvious problem from the mad-x side with this.

tpersson commented 3 years ago

Hi, I think this part of connecting the table to the nodes is used for the plotting and would be rather difficult to change since still, quite a few people are using the plotting. I think @coldfix solution works fine and I would recommend going for that in cpymad.

coldfix commented 3 years ago

Can we trust any data in the table.twiss after use?

As far as I see, the data in the tables are not touched by USE, it's just that the sequence nodes get recreated which is a problem only for the node_nm pointer that is used by cpymad for the index.

JoschD commented 3 years ago

Thank you so much! This is an amazing package and I am very thankful that you are still maintaining it!

hibtc / cpymad

Differences in cpymad `table.twiss` and madx `table(twiss, X, X)` when running `use` #93