Open jameswilburlewis opened 11 months ago
As I said during the meeting, I assumed that it was a default behavior. I had to use deepcopy at lease here https://github.com/spedas/pyspedas/blob/78f498b9e4e349edb1013695dca752fce4e4947e/pyspedas/themis/spacecraft/fields/fit.py#L177
from copy import deepcopy
d = deepcopy(get_data(tvar))
There could be other places in the code and in other peoples' projects. I recommend to include a flag for deepcopy and during the major release have a note that the default behavior is changing. But it might brake things and it will definitely affect the performance.
Another problem with store_data is that it crashes when it is used with two tplot variables and one of them does not exist. IDL handles this correctly.
As mentioned at the meeting yesterday, we'll need to identify any code (e.g. mms utilities) that relies on the current behavior, and change those get_data calls to get direct references and not copies. This is a somewhat tricky problem...I can think of 3 approaches we might be able to use: 1) Static analysis (with help from ChatGPT). 2) Add some hooks to store_data/get_data/del_data to detect modifications to the stored data. 3) Ask Eric if we're stumped after the first two methods...
As things stand now, get_data returns a reference, rather than a copy, of the time and data arrays. So any modifications to the returned arrays will affect the stored values in the tplot variable, which is not at all the desired behavior.
There may be cases where the user knows the data won't be modified, and prefers a reference rather than a copy, to avoid the memory and performance penalty of making the copies. We can add a keyword to specify references rather than copies.