Closed kitchoi closed 8 years ago
I tested this generator code with the simphony_metadata.yml in update_yaml_files (#7).
I also added a number of "FIXME" comments in the code. These are items that I think we should discuss either in a separate issue thread or on the wiki.
Here is a summary of these items:
(1) The shape syntax (already in #9).
(2) For some CUBA keys, there are property names which are CUBA keys as well. I don't know what special purpose they serve. For example, CUBA.DESCRIPTION and CUBA.NAME are attributes of CUDS_COMPONENT. How is this different from using "description" and "name" in the yaml file? Should we be loading the cuba.yml file to check the shape etc.?
(3) If an attribute has a given value, is it read-only? (e.g. definition: something)
(4) There are going to be some utility functions; where do we put them? (e.g. a function for checking that a given value complies with a shape)
(5) CUBA.DATA must always be empty, is that right?
(6) Some default values are CUBA keys; do we keep them as is?
Again, this WIP PR is not meant for merging but serves as a proof of concept as we go along. The target generator code should provide command line functions similar to the good old cuba_generator.py.
> (2) For some CUBA keys, there are property names which are CUBA keys as well. I don't know what special purpose they serve. For example, CUBA.DESCRIPTION and CUBA.NAME are attributes of CUDS_COMPONENT. How is this different from using "description" and "name" in the yaml file? Should we be loading the cuba.yml file to check the shape etc.?

Please see the SSB wiki, in particular here.

> (3) If an attribute has a given value, is it read-only? (e.g. definition: something)

Please see https://github.com/simphony/simphony-metadata/pull/8#issuecomment-197366545

> (5) CUBA.DATA must always be empty, is that right?

Correct. This is the existing (flat) data structure of SimPhoNy, which is already implemented.

> (6) Some default values are CUBA keys; do we keep them as is?

Yes!

> (4) There are going to be some utility functions; where do we put them? (e.g. a function for checking that a given value complies with a shape)

For now we can add it to the base class of all cuds, i.e. CUDSItem.
I also have some suggestions:
I noticed the script generates Cuds instead of CUDS in class names. Since CUDS is an acronym, it is better written in all caps, e.g. CUDSItem, CUDSComponent, etc.
I also tried find_missing_cuba.py. There are keys that do not exist in CUBA but are missing in the script's output. A few examples are ISOTHERMAL_MODEL, LAMINAR_FLOW_MODEL, CONSTANT_ELECTROSTATIC_FIELD_MODEL, etc.
Nice generator though.
> For now we can add it to the base class of all cuds, i.e. CUDSItem.

If the function is general enough to be put in the base class and be shared by the subclasses, then I think it is a good candidate as a module-level function, which I think is simpler.
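
As a rough illustration of the module-level option, here is a minimal sketch of a shape check. The names `shape_of` and `complies_with_shape` are hypothetical, and the sketch assumes a shape is a sequence of integers and that a scalar has an empty shape; the actual shape syntax is still under discussion in #9, so this may not match what is finally agreed.

```python
from collections.abc import Sequence


def shape_of(value):
    """Return the shape of a nested sequence as a tuple, e.g. (2, 3).

    ASSUMPTION: scalars (and strings) are treated as having shape ().
    """
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        if len(value) == 0:
            return (0,)
        inner = {shape_of(item) for item in value}
        if len(inner) != 1:
            raise ValueError("ragged sequence has no well-defined shape")
        return (len(value),) + inner.pop()
    return ()


def complies_with_shape(value, shape):
    """Return True if `value` has exactly the given shape (hypothetical helper)."""
    try:
        return shape_of(value) == tuple(shape)
    except ValueError:
        # A ragged value cannot comply with any rectangular shape.
        return False
```

Because the function needs nothing from `self`, keeping it at module level means it can be tested and reused without instantiating any CUDS class.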
> I noticed the script generates Cuds instead of CUDS in class names. Since CUDS is an acronym it is better to be written in all caps, e.g. CUDSItem, CUDSComponent, etc.

Agree. Can make a set of special words and pass it to the to_camel_case function.
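
A minimal sketch of that idea follows. The real `to_camel_case` in the generator may have a different signature; the `special` parameter and its default are assumptions made for illustration.

```python
def to_camel_case(text, special=("cuds",)):
    """Convert UPPER_SNAKE_CASE to CamelCase, keeping special words in caps.

    ASSUMPTION: `special` is a hypothetical parameter listing acronyms
    (case-insensitive) that should stay fully upper-case.
    """
    special_lower = {word.lower() for word in special}
    parts = []
    for word in text.split("_"):
        if not word:
            continue  # skip empty pieces from leading or doubled underscores
        if word.lower() in special_lower:
            parts.append(word.upper())
        else:
            parts.append(word.capitalize())
    return "".join(parts)
```

With the default set, `to_camel_case("CUDS_ITEM")` gives `CUDSItem`; with an empty set it falls back to `CudsItem`.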
> I also tried find_missing_cuba.py. There are keys that do not exist in CUBA but are missing in the script's output. A few examples are ISOTHERMAL_MODEL, LAMINAR_FLOW_MODEL, CONSTANT_ELECTROSTATIC_FIELD_MODEL, etc.

ISOTHERMAL_MODEL, LAMINAR_FLOW_MODEL and CONSTANT_ELECTROSTATIC_FIELD_MODEL (etc.) are already defined in simphony_metadata.yml. See @tuopuu's comment: https://github.com/simphony/simphony-metadata/pull/7#issuecomment-197362677
Therefore the generator for cuba.py will need to read simphony_metadata.yml as well.
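
To make the consequence concrete, here is a small sketch of merging the keys from both files before looking for missing ones. It assumes the two YAML files have already been parsed into plain dicts, and the top-level section names "CUBA_KEYS" and "CUDS_KEYS" are hypothetical placeholders rather than a statement about the real file layout.

```python
def collect_cuba_keys(cuba_doc, metadata_doc):
    """Return the sorted union of keys defined in the two parsed documents.

    ASSUMPTION: both arguments are dicts as produced by a YAML loader, with
    hypothetical top-level sections "CUBA_KEYS" and "CUDS_KEYS".
    """
    keys = set(cuba_doc.get("CUBA_KEYS", {}))
    keys.update(metadata_doc.get("CUDS_KEYS", {}))
    return sorted(keys)


def find_missing(referenced, cuba_doc, metadata_doc):
    """Return the referenced keys that neither document defines."""
    defined = set(collect_cuba_keys(cuba_doc, metadata_doc))
    return sorted(set(referenced) - defined)
```

Under these assumptions, a key such as ISOTHERMAL_MODEL that is defined only in the metadata document is no longer reported as missing.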
I prefer one module which contains all the definitions (i.e. models, entities, Data Transfer Objects). This way it is easier to import them, both dynamically by other pieces of code and by users. I don't see any benefit in having many tiny modules. It reminds me of the Java convention of having one file per class.
Then we need to be careful to avoid import cycles. The benefit of having each class in a separate file is twofold: (1) the only possibility for an import cycle is if the meta schema defines a direct or indirect cycle in the relationship tree, in which case it is the fault of the schema; (2) it is easier for the generator, since you don't have to sort the classes within a module.
In addition, the yml file does not offer information for how to organise these classes, which means that the organisation cannot be automated.
> Then we need to be careful to avoid import cycles

I am afraid I can't see how import cycles might happen in one module with many class definitions. These modules should be very simple and independent from the rest of the application and must have no (or very trivial) imports, except from the standard library, which will not cause any cyclic imports.

> The benefit of having each class as a separate file

The unit of decomposition and reuse in Python is the module, and it makes sense to put classes which are tightly related inside one module. However, one might create one module for each high-level component, i.e. material_relations, physics_equations, etc.

> In addition, the yml file does not offer information for how to organise these classes

This is true; however, the order of entities is irrelevant for the metadata. If we can find a way to generate classes in a fixed and working order, it will solve the problem. Otherwise we can have multiple modules, but we should not import anything directly from those modules; instead, import them from a higher-level module.
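
One possible way to obtain such a fixed and working order is to emit every parent class before its children. The sketch below assumes the generator can build a mapping from each class name to its parent name (or `None` for a root); the function name `order_by_parent` and the example hierarchy are illustrative only.

```python
def order_by_parent(parents):
    """Return class names ordered so that each parent precedes its children.

    `parents` maps a class name to its parent name, or None for a root.
    Names are visited in sorted order so the result is reproducible.
    """
    ordered = []
    visiting = set()
    done = set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError("cycle in the inheritance tree at %r" % name)
        visiting.add(name)
        parent = parents.get(name)
        # Parents defined outside the mapping are assumed to be imported.
        if parent is not None and parent in parents:
            visit(parent)
        visiting.discard(name)
        done.add(name)
        ordered.append(name)

    for name in sorted(parents):
        visit(name)
    return ordered
```

A cycle in the schema surfaces as a `ValueError` at generation time instead of as a broken module later on.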
> I am afraid I can't see how import cycles might happen in one module with many class definitions. These modules should be very simple and independent from the rest of the application and must have no (or very trivial) imports, except from the standard library, which will not cause any cyclic imports.

So I must have misread the original sentence "I prefer one module which contains all the definitions (i.e. models, entities, Data Transfer Objects)." to mean having one module for each of the "models", "entities" and "Data Transfer Objects". Yes, there would be no import cycle if you put everything in one module.

I agree that some organisation would be convenient, e.g. grouping model classes into a model module and physical entities into an entity module. The person who has to manually group generated classes into a number of modules will inevitably have to make sure no import cycle is created. In addition, it does not seem to me that this organisation should concern the meta schema, but rather the places where the meta schema is used (e.g. simphony-common, simphony-lammps-md, etc.).

As an alternative to printing all classes into one file, one could also provide an api module that collects all the classes for easy access.
> one could also provide an api module that collects all the classes for easy access

We should definitely do that. Whether we put the classes in a single module or in multiple modules, they should be accessible the same way, like this:

    from simphony.core import PE, MR  # etc.
    # or
    from simphony.cuds import PE, MR  # etc.
    # or
    from simphony import PE, MR  # etc.

Regardless, I am in favor of having only one module which contains all the generated classes, unless something else proves necessary.
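
If the classes do end up in several modules, the generator could also write the api module itself. The following is only a sketch: `render_api_module` is a hypothetical helper, and the class names in the usage note are invented for illustration (the module names are the ones mentioned earlier in this thread).

```python
def render_api_module(classes_by_module):
    """Return the source text of an api module re-exporting generated classes.

    `classes_by_module` maps a sibling module name to the class names it
    defines. Output is sorted so repeated runs give identical files.
    """
    lines = []
    exported = []
    for module in sorted(classes_by_module):
        names = sorted(classes_by_module[module])
        lines.append("from .%s import %s  # noqa" % (module, ", ".join(names)))
        exported.extend(names)
    lines.append("")
    lines.append("__all__ = %r" % sorted(exported))
    return "\n".join(lines) + "\n"
```

For example, `render_api_module({"material_relations": ["MaterialRelation"], "physics_equations": ["PhysicsEquation"]})` produces two relative imports and an `__all__` list, so users import from one place regardless of how the classes are split.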
@kitchoi @roigcarlo I think it is a good idea to include the CUBA key for each class inside it. For example:

    class Atomistic(ComputationalModel):
        def __init__(self):
            pass

        @property
        def definition(self):
            return "Atomistic model category according to the RoMM"

        @property
        def cuba_key(self):
            return CUBA.ATOMISTIC

Please consider the merged PR #14 for the generator.
Superseded by #17, closing.
As I was playing with the metadata (my way of understanding it more deeply), I also made this generator. Bits of this may be useful? @roigcarlo