gecos-lab / PZero

GNU Affero General Public License v3.0

Entity properties management #50

Closed gbene closed 2 weeks ago

gbene commented 1 year ago

Hi all,

As we already discussed, there is a problem with how entities are managed in PZero. One of the main points brought up by @luca-penasa is that the entities do not know "who they are" nor which collection they belong to. This is a very important point that needs to be discussed.

Currently in PZero we save all the information of a given entity in a pandas df. This df is visible to and accessed by the user through the table view in the different tabs of the main project window. To access the information of an entity (or multiple entities) we slice the df and extract the metadata that we need. The actual vtk object does not contain any identity information and is just stored in a column of the df. This approach is well suited for "bulk" queries. For example, extracting the entities that have specific properties/parameters, or changing/adding/removing properties, is much quicker than looping through all the vtk entities and checking the different combinations of parameters for each one. For single operations, on the other hand, having some information directly in the entity can be useful and much more efficient.

An example that I have recently stumbled upon is the interactive selection of objects in the different views. The system now in place works well when we use the table-centered approach but becomes a bit shaky when we want to interactively select an object in a view, since there we are working directly on the objects. Maybe we should aim for a system that combines both approaches: store at least the uuid and collection in the entity, thus giving it an "identity", and keep the dataframe system, which can be used to quickly get the needed information. This adds complexity because we need to be sure that the entity and the df hold the same information, but if we assign only immutable attributes (uuid, collection, etc.) to the entities it should not be a problem. Still, building a system that connects the two approaches could prove useful in the future.
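To make the combined approach concrete, here is a minimal sketch using only the standard library. It is not PZero code: `EntityIdentity`, `Entity`, `register` and the `table` dict (a stand-in for the pandas df, keyed by uid) are all hypothetical names chosen for illustration. The point is only that a picked object carrying a frozen identity can find its own row without scanning the table.

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass(frozen=True)
class EntityIdentity:
    """Immutable identity carried by the object itself (hypothetical)."""
    uid: str
    collection: str


class Entity:
    """Hypothetical stand-in for a VTK object stored in the table."""

    def __init__(self, identity, points):
        self.identity = identity
        self.points = points


# Stand-in for the pandas df: one row per entity, keyed by uid.
table = {}


def register(collection, name, points):
    """Create an entity, give it an identity, and add its row to the table."""
    identity = EntityIdentity(uid=str(uuid4()), collection=collection)
    entity = Entity(identity, points)
    table[identity.uid] = {
        "collection": collection,
        "name": name,
        "vtk_obj": entity,
    }
    return entity


# An object "picked" in a view can look up its own row directly.
picked = register("DomCollection", "fault_1", [(0.0, 0.0, 0.0)])
row = table[picked.identity.uid]
```

Because the identity is frozen, trying to reassign `picked.identity.uid` raises an error, which is what keeps the object and the table from drifting apart for these attributes.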

VTK can attach string data to a dataset through its field data. We could leverage this to inject a dictionary, serialized as a string, with the info. This is a quick working code example using pyvista:


import ast

import numpy as np
import pyvista as pv

points = np.random.random((10, 3))  # 10 random 3D points (the shape must be a tuple)

metadata = {'uid':'14d4422e-2662-4379-b41e-b79d14818337','Collection':'DomCollection'}

polydata = pv.PolyData(points)

# Create a 'metadata' field data entry holding the dict serialized as a string.
# The value needs to be in a list or numpy array.
polydata.field_data['metadata'] = [str(metadata)]

# Retrieve metadata: this returns an array with the dict as a string.
ret_metadata = polydata.field_data['metadata']

# Convert the string back to an actual dict.
ret_metadata_dict = ast.literal_eval(str(ret_metadata[0]))

uid = ret_metadata_dict['uid']

Let me know your thoughts!

p.s. this is somewhat related https://github.com/andrea-bistacchi/PZero/issues/33

andrea-bistacchi commented 1 year ago

Thoughts:

(1) I think the only case when you need to select an unknown VTK entity and get its name is when selecting or picking in a 2D or 3D view. I might be wrong, so please think about this point, which is crucial.

(2) If (1) is true, this function is probably already provided by PyVista, since each actor is named with its uid. This means that duplicating the metadata as field data would not be necessary.

(3) If (1) is NOT true, replicating the metadata as field data could be necessary.

(4) If metadata should be duplicated as field data, ensuring they are always synchronized is fundamental. If we just duplicate uid and collection, this is simpler, but not completely foolproof. A critical situation could be when you change the size of an object (i.e. change its topology due to resampling, etc.), since in this case the metadata are not modified but the VTK entity is replaced with a new one in the dataframe cell. This is because VTK entities can be deformed by changing coordinate values, but they cannot grow or shrink by changing the number of points or cells, since the size of the underlying arrays is immutable.
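The replacement case in (4) can be sketched as follows, again with the standard library only. `SimpleEntity`, `replace_entity` and the `table` dict are hypothetical stand-ins for a VTK entity, a PZero helper and the dataframe; nothing here is existing PZero or VTK API. The sketch assumes that every replacement goes through a single function, which copies the duplicated identity onto the new object before swapping it into the table.

```python
class SimpleEntity:
    """Hypothetical stand-in for a VTK entity with duplicated metadata."""

    def __init__(self, points, metadata=None):
        self.points = list(points)
        self.metadata = dict(metadata or {})


def replace_entity(table, uid, new_entity):
    """Swap the object stored for uid, carrying the identity over to it."""
    old_entity = table[uid]["vtk_obj"]
    # Copy rather than share, so the old object cannot alter the new one.
    new_entity.metadata = dict(old_entity.metadata)
    table[uid]["vtk_obj"] = new_entity
    return new_entity


uid = "14d4422e-2662-4379-b41e-b79d14818337"
identity = {"uid": uid, "Collection": "DomCollection"}

# Stand-in for the dataframe: one row keyed by uid.
table = {
    uid: {
        "name": "line_1",
        "vtk_obj": SimpleEntity([(0, 0, 0), (1, 0, 0)], identity),
    }
}

# Resampling produces a new object with a different number of points.
resampled = SimpleEntity([(0, 0, 0), (0.5, 0, 0), (1, 0, 0)])
replace_entity(table, uid, resampled)
```

If any code path replaces the object in the table cell without going through such a function, the new object silently loses its identity, which is the failure mode described in (4).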

andrea-bistacchi commented 2 weeks ago

uid is provided by PyVista, so we do not need to duplicate anything.