Open WenzDaniel opened 11 months ago
I started looking into this. It looks straightforward, and I think I now understand what you meant by typecasting. Just to confirm:
a = st.get_metadata(run_id, target)['lineage']
b = st.key_for(run_id, target).lineage
strax.utils.compare_dict(a, b)
Both return the same information, but one returns it as lists while the other returns it as tuples.
Then I need to check how deep the lists and tuples are nested to convert them.
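A minimal sketch of where such a list/tuple mismatch can come from, assuming the stored metadata goes through JSON serialization. The lineage entry below is a made-up example, not real strax output:

```python
import json

# Hypothetical lineage entry: (plugin name, version, config options).
in_memory = {"event_basics": ("EventBasics", "0.1.0", {"window": (0, 10)})}

# JSON has no tuple type, so a write/read cycle hands back lists instead.
from_disk = json.loads(json.dumps(in_memory))

print(in_memory == from_disk)  # False: a tuple never equals a list
print(from_disk)
```

The contents are identical, but the equality check fails at every nesting level where a tuple became a list.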
Should I open a PR branching from commit 5f9051b? The version on master is different.
I have refactored the code to check lineages in case the data is not stored. I also wrote the following utility function for converting tuples to lists in nested objects of arbitrary depth.
import copy

def convert_tuple_to_list(init_func_input):
    # work on a copy so the caller's object is never modified
    func_input = copy.deepcopy(init_func_input)
    # if it is a tuple, convert it and recurse
    if isinstance(func_input, tuple):
        _func_input = list(func_input)
        return convert_tuple_to_list(_func_input)
    # if it is a list, go over all the elements until all tuples are lists
    elif isinstance(func_input, list):
        new_func_inp = []
        # check each element
        for i in func_input:
            new_func_inp.append(convert_tuple_to_list(i))
        # the recursion continues until all depths are exhausted
        return new_func_inp
    # if it is a dict, convert each value
    elif isinstance(func_input, dict):
        for k, v in func_input.items():
            func_input[k] = convert_tuple_to_list(v)
        return func_input
    else:
        # if not a container, return. i.e. int, float, bytes, str etc.
        return func_input
I tested it and it seems to be working as intended.
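For comparison, a more compact variant of the same idea, shown only as a sketch. The name `tuples_to_lists` is hypothetical. It builds new containers on the way down instead of deep-copying at every level, so leaf objects are shared with the input rather than copied:

```python
def tuples_to_lists(obj):
    """Return a copy of obj with every tuple, at any depth, replaced by a list."""
    if isinstance(obj, (list, tuple)):
        return [tuples_to_lists(item) for item in obj]
    if isinstance(obj, dict):
        return {key: tuples_to_lists(value) for key, value in obj.items()}
    # Not a container (int, float, str, bytes, ...): return unchanged.
    return obj


nested = {"a": (1, (2, 3)), "b": [4, (5, {"c": (6,)})]}
converted = tuples_to_lists(nested)
print(converted)  # {'a': [1, [2, 3]], 'b': [4, [5, {'c': [6]}]]}
```

The input is left untouched, so `nested["a"]` is still a tuple afterwards.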
However, I think the lineage comparison might be a little misleading, because the lineages of different runs do not differ if they come from the same context. Any run ID will result in the same lineage as long as the data type is the same, e.g.:
st.key_for("000000", "event_basics").lineage == st.key_for("053675", "event_basics").lineage
is always True. So by comparing the lineages (after straightening out the tuples and lists, if either of the compared parts is different) we can only say that they are created with the same context and they are looking at the same datatype. Is this the intended behavior?
> So by comparing the lineages (after straightening out the tuples and lists, if either of the compared parts is different) we can only say that they are created with the same context and they are looking at the same datatype. Is this the intended behavior?
No, then you did not understand my use case. In the current implementation we can only compare the lineage of the current context with the metadata of some stored data if the metadata for the current context is also already stored somewhere. This is because the metadata of the context is loaded via self.get_meta, which only works if the data is stored somewhere. However, I think the nominal use case is rather: "I cannot load any data with my current context. Why does my lineage differ from this piece of data stored in my directory?"
You can just branch off master for your changes.
We should add a fallback method to https://github.com/AxFoundation/strax/blob/5f9051bcb4fb4a8d46e5fd3da21e522959735352/strax/context.py#L1682 which only compares the lineages in case the current data type is not stored. Currently, we are loading the metadata via:
As a fallback in case the data is not stored, we should:
One only needs to cast all tuples into lists (or vice versa), or deactivate the type checking, as otherwise one cannot compare against metadata files loaded from disk.
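A rough sketch of what such a fallback comparison could look like on plain dictionaries. The helper names `_normalize` and `lineage_differences` are hypothetical and not part of strax, and the lineage layout (data type mapped to plugin name, version, and config) is assumed for illustration. In practice the first argument would presumably come from st.key_for(run_id, target).lineage and the second from a metadata file read from disk:

```python
def _normalize(obj):
    """Replace every tuple with a list so that container types do not matter."""
    if isinstance(obj, (list, tuple)):
        return [_normalize(item) for item in obj]
    if isinstance(obj, dict):
        return {key: _normalize(value) for key, value in obj.items()}
    return obj


def lineage_differences(current, stored):
    """Return {data_type: (current_entry, stored_entry)} for entries that differ."""
    current, stored = _normalize(current), _normalize(stored)
    return {
        key: (current.get(key), stored.get(key))
        for key in sorted(set(current) | set(stored))
        if current.get(key) != stored.get(key)
    }


# Made-up lineages: tuples as computed in memory, lists as read back from disk.
current = {
    "records": ("Records", "0.1.0", {"threshold": 5}),
    "peaks": ("Peaks", "0.2.0", {"gap": (1, 2)}),
}
stored = {
    "records": ["Records", "0.1.0", {"threshold": 5}],
    "peaks": ["Peaks", "0.1.0", {"gap": [1, 2]}],
}

diff = lineage_differences(current, stored)
print(diff)  # only "peaks" is reported, because its version differs
```

Normalizing both sides first means the comparison never needs the current context's data to be stored, and only genuine differences such as a changed version or config option are reported.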