Draft generator

kitchoi commented 8 years ago

As I was playing with the metadata (my way of understanding it more deeply), I also made this generator. Bits of this may be useful? @roigcarlo

kitchoi commented 8 years ago

I tested this generator code with the simphony_metadata.yml in update_yaml_files (#7). I also added a number of "FIXME" comments in the code. These are items that I think we should discuss either in a separate issue thread or on the wiki.

Here is a summary of these items:

(1) The shape syntax (already in #9)

(2) Some CUBA keys have property names that are themselves CUBA keys, and I don't know what special purpose they serve. For example, CUBA.DESCRIPTION and CUBA.NAME are attributes of CUDS_COMPONENT. How is this different from using "description" and "name" in the yaml file? Should we be loading the cuba.yml file to check the shape, etc.?

(3) If an attribute has a given value, is it read-only? (e.g. definition: something)

(4) There are going to be some utility functions; where do we put them? (e.g. a function for checking that a given value complies with the shape)

(5) CUBA.DATA must always be empty, is that right?

(6) Some default values are CUBA keys; do we keep them as is?

Again, this WIP PR is not meant for merging but serves as a proof of concept as we go along. The target generator code should provide command-line functions similar to the good old cuba_generator.py.

ahashibon commented 8 years ago

(2) Some CUBA keys have property names that are themselves CUBA keys, and I don't know what special purpose they serve. For example, CUBA.DESCRIPTION and CUBA.NAME are attributes of CUDS_COMPONENT. How is this different from using "description" and "name" in the yaml file? Should we be loading the cuba.yml file to check the shape, etc.?

Please see the SSB wiki

in particular: here

ahashibon commented 8 years ago

(3) If an attribute has a given value, is it read-only? (e.g. definition: something)

please see https://github.com/simphony/simphony-metadata/pull/8#issuecomment-197366545

ahashibon commented 8 years ago

(5) CUBA.DATA must always be empty, is that right?

Correct. This is the existing (flat) data structure of SimPhoNy, which is already implemented.

ahashibon commented 8 years ago

(6) Some default values are CUBA keys; do we keep them as is?

Yes!

mehdisadeghi commented 8 years ago

(4) There are going to be some utility functions; where do we put them? (e.g. a function for checking that a given value complies with the shape)

For now we can add it to the base class of all cuds, i.e. CUDSItem.

I also have some suggestions:

I noticed the script generates Cuds instead of CUDS in class names. Since CUDS is an acronym it is better to be written in all caps, e.g. CUDSItem, CUDSComponent, etc.
I also tried find_missing_cuba.py. There are keys that do not exist in CUBA but are missing in the script's output. A few examples are ISOTHERMAL_MODEL, LAMINAR_FLOW_MODEL, CONSTANT_ELECTROSTATIC_FIELD_MODEL, etc.
I prefer one module which contains all the definitions (i.e. models, entities, Data Transfer Objects). This way it is easier to import them, both dynamically by other pieces of code and by users. I don't see any benefit in having many tiny modules. It reminds me of the Java convention of having one file per class.
If we do end up with one module per class, use relative imports to import the others.

Nice generator though.

kitchoi commented 8 years ago

For now we can add it to the base class of all cuds, i.e. CUDSItem.

If the function is general enough to be put in the base class and shared by the subclasses, then it is a good candidate for a module-level function, which I think is simpler.
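A minimal sketch of what such a module-level helper might look like. The shape syntax is still under discussion in #9, so this assumes a shape is simply a tuple of integers, that a scalar has the empty shape `()`, and that values are plain nested lists or tuples. The names `get_shape` and `check_shape` are hypothetical and not taken from the generator.

```python
def get_shape(value):
    """Return the shape of a nested list/tuple as a tuple of ints.

    Assumption: a scalar (anything that is not a list or tuple) has
    shape (). Ragged nesting raises ValueError.
    """
    if not isinstance(value, (list, tuple)):
        return ()
    if len(value) == 0:
        return (0,)
    inner = [get_shape(item) for item in value]
    if any(shape != inner[0] for shape in inner):
        raise ValueError("ragged value: inner shapes differ")
    return (len(value),) + inner[0]


def check_shape(value, shape):
    """Return True if `value` has exactly the expected `shape`."""
    try:
        return get_shape(value) == tuple(shape)
    except ValueError:
        # A ragged value cannot comply with any regular shape.
        return False
```

Because it is a plain function, any generated class could call it from a setter without needing it to live on CUDSItem.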

I noticed the script generates Cuds instead of CUDS in class names. Since CUDS is an acronym it is better to be written in all caps, e.g. CUDSItem, CUDSComponent, etc.

Agreed. We can make a set of special words and pass it to the to_camel_case function.
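A rough sketch of that idea, assuming the function receives an UPPER_SNAKE_CASE key and a collection of words whose capitalisation should be preserved. The signature and the default `special` words are assumptions for illustration; the actual to_camel_case in the generator may differ.

```python
def to_camel_case(text, special=("CUDS", "CUBA")):
    """Convert an UPPER_SNAKE_CASE key to CamelCase.

    Words listed in `special` keep the capitalisation given there
    instead of being reduced to a single leading capital.
    """
    keep = {word.upper(): word for word in special}
    parts = [part for part in text.split("_") if part]
    return "".join(keep.get(part.upper(), part.capitalize())
                   for part in parts)
```

With this, `to_camel_case("CUDS_ITEM")` yields `CUDSItem`, while keys without special words, such as `LAMINAR_FLOW_MODEL`, become `LaminarFlowModel` as before.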

I also tried find_missing_cuba.py. There are keys that do not exist in CUBA but are missing in the script's output. A few examples are ISOTHERMAL_MODEL, LAMINAR_FLOW_MODEL, CONSTANT_ELECTROSTATIC_FIELD_MODEL, etc.

ISOTHERMAL_MODEL, LAMINAR_FLOW_MODEL and CONSTANT_ELECTROSTATIC_FIELD_MODEL (etc.) are defined in simphony_metadata.yml already. See @tuopuu's comment: https://github.com/simphony/simphony-metadata/pull/7#issuecomment-197362677 Therefore the generator for cuba.py will need to read simphony_metadata.yml as well.
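To illustrate, here is a small sketch of merging the two sources. It assumes the keys have already been loaded from cuba.yml and simphony_metadata.yml into plain Python collections; the loading step and the function names are hypothetical.

```python
def all_cuba_keys(cuba_keys, metadata_keys):
    """Return the sorted union of the keys defined in both files."""
    return sorted(set(cuba_keys) | set(metadata_keys))


def only_in_metadata(cuba_keys, metadata_keys):
    """Return the keys that only simphony_metadata.yml defines."""
    return sorted(set(metadata_keys) - set(cuba_keys))


# Illustrative input only: a few keys mentioned in this thread.
cuba_keys = ["NAME", "DESCRIPTION"]
metadata_keys = ["ISOTHERMAL_MODEL", "LAMINAR_FLOW_MODEL"]
merged = all_cuba_keys(cuba_keys, metadata_keys)
```

A generator for cuba.py would then iterate over `merged` rather than over the cuba.yml keys alone.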

I prefer one module which contains all the definitions (i.e. models, entities, Data Transfer Objects). This way it is easier to import them, both dynamically by other pieces of code and by users. I don't see any benefit in having many tiny modules. It reminds me of the Java convention of having one file per class.

Then we need to be careful to avoid import cycles. Having each class in a separate file has two benefits: (1) an import cycle is only possible if the meta schema defines a direct or indirect cycle in the relationship tree, in which case it is the fault of the schema; (2) it is easier for the generator, because you don't have to sort the classes within a module.

In addition, the yml file does not offer information for how to organise these classes, which means that the organisation cannot be automated.

mehdisadeghi commented 8 years ago

Then we need to be careful to avoid import cycles

I am afraid I can't see how import cycles might happen in one module with many class definitions. These modules should be very simple and independent from the rest of the application and must have no (or very trivial) imports, except from the standard library, which will not cause any cyclic imports.

The benefit of having each class as a separate file

The unit of decomposition and reuse in Python is the module, and it makes sense to put classes which are tightly related inside one module. However, one might create one module for each high-level component, i.e. material_relations, physics_equations, etc.

In addition, the yml file does not offer information for how to organise these classes

This is true; however, the order of entities is irrelevant for the metadata. If we can find a way to generate classes in a fixed and working order, it will solve the problem. Otherwise we can have multiple modules, but we should not import anything directly from those modules; instead, we should import them from a higher-level module.
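One possible way to get a fixed and working order is to sort the classes so that every parent is emitted before its children. The sketch below assumes the generator can build a mapping from each key to its parent key (or None for a root); the function name and the sample hierarchy are illustrative, not taken from the schema.

```python
def sort_by_parent(parents):
    """Order class names so that each parent precedes its children.

    `parents` maps a class name to its parent name (None for a root).
    Names are visited alphabetically, so the output is reproducible.
    Raises ValueError if the inheritance relation contains a cycle.
    """
    ordered = []
    done = set()
    in_progress = set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError("inheritance cycle involving %r" % name)
        in_progress.add(name)
        parent = parents.get(name)
        if parent is not None:
            visit(parent)  # emit the parent first
        in_progress.discard(name)
        done.add(name)
        ordered.append(name)

    for name in sorted(parents):
        visit(name)
    return ordered


# Illustrative hierarchy (assumed, not read from the yml file).
order = sort_by_parent({
    "ATOMISTIC": "COMPUTATIONAL_MODEL",
    "COMPUTATIONAL_MODEL": "CUDS_ITEM",
    "CUDS_ITEM": None,
})
```

A cycle in the schema surfaces as a ValueError at generation time instead of as an import error later.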

kitchoi commented 8 years ago

I am afraid I can't see how import cycles might happen in one module with many class definitions. These modules should be very simple and independent from the rest of the application and must have no (or very trivial) imports, except from the standard library, which will not cause any cyclic imports.

So I must have misread the original sentence "I prefer one module which contains all the definitions (i.e. models, entities, Data Transfer Objects)." to mean having one module for each of the "models", "entities" and "Data Transfer Objects". Yes, there would be no import cycle if you put everything in one module.

I agree that some organisation would be convenient, e.g. grouping model classes into a model module and physical entities into an entity module. The person who has to manually group generated classes into a number of modules will inevitably have to make sure no import cycle is created. In addition, it seems to me that this organisation should not be a concern of the meta schema, but of the places where the meta schema is used (e.g. simphony-common, simphony-lammps-md, etc.).

kitchoi commented 8 years ago

As an alternative to printing all classes into one file, one could also provide an api module that collects all the classes for easy access.

mehdisadeghi commented 8 years ago

one could also provide an api module that collects all the classes for easy access

We should definitely do that. No matter if we put classes in single module or multiple modules, they should be accessible the same way, like this:

from simphony.core import PE, MR  # etc.
# or
from simphony.cuds import PE, MR  # etc.
# or
from simphony import PE, MR  # etc.

Regardless, I am in favor of having only one module which contains all the generated classes, unless something else proves necessary.
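For the multi-module case, an api module could gather the classes roughly as sketched below. This is only an illustration: `collect_classes` and `make_module` are hypothetical helpers, and the stand-in modules are built in memory so that the example runs on its own.

```python
import inspect
import types


def collect_classes(modules):
    """Return {class name: class} for classes defined in `modules`.

    Classes merely imported into a module (whose __module__ differs)
    are skipped, so each class is collected from its home module only.
    """
    collected = {}
    for module in modules:
        for name, obj in inspect.getmembers(module, inspect.isclass):
            if obj.__module__ == module.__name__:
                collected[name] = obj
    return collected


def make_module(module_name, class_name):
    """Build a stand-in for a generated one-class module."""
    module = types.ModuleType(module_name)
    cls = type(class_name, (object,), {"__module__": module_name})
    setattr(module, class_name, cls)
    return module


generated = [make_module("cuds_item", "CUDSItem"),
             make_module("atomistic", "Atomistic")]
api = collect_classes(generated)
```

Inside a real api module, the resulting dictionary could be merged into the module namespace, so that users import every class from one place regardless of how the files are laid out.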

mehdisadeghi commented 8 years ago

@kitchoi @roigcarlo I think it is a good idea to include the CUBA key for each class inside it. For example:

class Atomistic(ComputationalModel):

    def __init__(self):
        pass

    @property
    def definition(self):
        return "Atomistic model category according to the RoMM"

    @property
    def cuba_key(self):
        return CUBA.ATOMISTIC
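A sketch of how a generator might emit such a class from one metadata entry, using a plain string template. The template, the `render_class` helper and the stand-in `CUBA` enumeration are assumptions made for this example; the real generator and the real CUBA definition may look different.

```python
import enum


class CUBA(enum.Enum):
    # Stand-in for the real CUBA enumeration (assumption).
    ATOMISTIC = 1


CLASS_TEMPLATE = '''\
class {class_name}({parent}):

    @property
    def definition(self):
        return {definition!r}

    @property
    def cuba_key(self):
        return CUBA.{key}
'''


def render_class(key, class_name, parent, definition):
    """Return Python source for one generated class."""
    return CLASS_TEMPLATE.format(key=key, class_name=class_name,
                                 parent=parent, definition=definition)


source = render_class("ATOMISTIC", "Atomistic", "ComputationalModel",
                      "Atomistic model category according to the RoMM")

# Execute the generated source with a placeholder parent class to
# check that it is valid Python.
namespace = {"CUBA": CUBA, "ComputationalModel": object}
exec(source, namespace)
atomistic = namespace["Atomistic"]()
```

Since both properties have no setter, `definition` and `cuba_key` are read-only on the generated class.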

mehdisadeghi commented 8 years ago

Please consider the merged PR #14 for the generator.

kitchoi commented 8 years ago

https://publicwiki-01.fraunhofer.de/SimPhoNy-Project/index.php/SimPhoNy_Metadata_Schema#Converting_a_general_metadata_element_to_a_class

kitchoi commented 8 years ago

Superseded by #17, closing.

simphony / simphony-metadata

Draft generator #8