simphony / simphony-common

The native implementation of the Simphony cuds objects
BSD 2-Clause "Simplified" License
7 stars 5 forks source link

Feature: Add support for restricted cuba key in DataContainer #293

Closed kitchoi closed 8 years ago

kitchoi commented 8 years ago

In order to replicate the keyword checking in MaterialRelation (https://github.com/simphony/simphony-common/blob/master/simphony/cuds/material_relations/material_relation.py#L107) for all the generated classes from simphony-metadata, it would be more effective if the checking is done at the DataContainerlevel.

For example:

>>> from simphony.core.data_container import create_data_container
>>> container = create_data_container((CUBA.NAME, CUBA.VELOCITY))()
>>> container.restricted_keys
frozenset({<CUBA.VELOCITY: 55>, <CUBA.NAME: 61>})
>>> container[CUBA.NAME] = 'name'
>>> container[CUBA.POSITION] = (1.0, 1.0, 1.0)
...
ValueError: Key <CUBA.POSITION: 119> is not in the supported CUBA keywords

Performance of the DataContainer is basically the same to the one we have now. Benchmark:

Initialization:
dict: 1000000 calls, best of 5 repeats: 0.000001 sec per call
DataContainer: 100000 calls, best of 5 repeats: 0.000011 sec per call  (was 0.000010)
dict == DataContainer True

Iterations:
dict: 1000000 calls, best of 5 repeats: 0.000001 sec per call
DataContainer: 1000000 calls, best of 5 repeats: 0.000001 sec per call  (same)

getitem access:
dict: 100000 calls, best of 5 repeats: 0.000007 sec per call
DataContainer: 100000 calls, best of 5 repeats: 0.000012 sec per call  (same)
dict == DataContainer True

setitem with CUBA keys:
dict: 10000 calls, best of 5 repeats: 0.000024 sec per call
DataContainer: 10000 calls, best of 5 repeats: 0.000061 sec per call  (was 0.000058)
dict == DataContainer True

(Running with Ubuntu 12 on a VM with 1 core, 2.5GHz, 4GB RAM)

Please consider this as a proposal and please comment, review or propose alternatives.

codecov-io commented 8 years ago

Current coverage is 96.62%

Merging #293 into master will increase coverage by <.01%

@@             master       #293   diff @@
==========================================
  Files            49         49          
  Lines          3814       3814          
  Methods           0          0          
  Messages          0          0          
  Branches        570        566     -4   
==========================================
+ Hits           3684       3685     +1   
  Misses           42         42          
+ Partials         88         87     -1   

Powered by Codecov. Last updated by 1559e47...f0451df

kemattil commented 8 years ago

Restricting keywords for a data container (e.g. by specifying a fixed set of supported keywords at the initialization) is something we have been discussing for a long time now and is related to several aspects of the SimPhoNy design (see at least issues #29, #216, #217, #223).

I continue to favour implementing this feature.

In addition, the proposed implementation with a restricted subclass of the data container and utilization of the frozenset seems elegant.

mehdisadeghi commented 8 years ago

It seems to me that DataContainer and CUDSItem are very similar and serve similar purposes. Why not consider DataContainer as the base class for metadata classes?

mehdisadeghi commented 8 years ago

I'm in favor of type and consistency checking in metadata classes and CUDS itself. IMO the functionality of create_data_container should be merged in generated classes.

kitchoi commented 8 years ago

IMO the functionality of create_data_container should be merged in generated classes.

Do you mean using create_data_container in the generated class or moving the code (def create_data_...) to the generated class module? My intention is for the former, so that with, say, the MaterialRelation class:

>>> material_relation = MaterialRelation()

>>> # same as material_relation.supported_parameters, but a set
>>> material_relation.data.restricted_keys
frozenset({CUBA.MATERIAL})

>>> material_relation.data[CUBA.POSITION] = (1.0, 1.0, 1.0)
ValueError: Key <CUBA.POSITION: 119> is not in the supported CUBA keywords
mehdisadeghi commented 8 years ago

Your latest code example is close to what I intended and looks good to me. As long as the use of the factory method is internal, it is fine.

kitchoi commented 8 years ago

Just added an optional argument to the create_data_container function for setting class name so that one can assign the dynamically created class to a (conventionally) private variable while still have pickling support.