Generate a Python API from the Geppetto Model

tarelli commented 8 years ago

We could generate a Python API to be able to create a Geppetto Model in Python. This would be made easy using tools like Acceleo. Python has the advantage of being the most familiar language to scientists. The idea is for this API to be used through the frontend of Geppetto itself, therefore allowing users to create their models from the browser.

tarelli commented 7 years ago

@aranega saw your gist here to generate Python code from an ecore https://gist.github.com/aranega/f07e4cb4e850af3288af . Was there any more work done on this that you know of? It would be very useful to us! :) Thanks.

aranega commented 7 years ago

@tarelli This small piece of work performs Python generation from UML; I made it for www.genmymodel.com (an online modeling platform with embedded code generation). I know that there is a generator from .ecore to Python by Obeo, but I have never succeeded in making it work. Besides this, there are perhaps some generators out there, but I am not aware of them, sorry :(.

Regarding your project, it really depends on your input domain and what you want to generate exactly. From my experience, you can always find a way to generate what you want using Acceleo ;). I quickly checked your input metamodel: https://github.com/openworm/org.geppetto.model/blob/master/src/main/resources/geppettoModel.ecore Is your input domain the .ecore file or a model that conforms to this metamodel? So, do you want to generate the equivalent of the generated Java API from EMF in Python, or to generate a new Python-based API from geppetto models? Also, Python 2 or 3?

LordKrabo commented 7 years ago

@tarelli @aranega I believe it is python 2.7.

tarelli commented 7 years ago

@aranega thanks for getting back! :) Both would be useful for us, an API to operate at the meta level and create models (ecore would be the input, conceptually equivalent to EMF) and an API that can start instead from a given model (e.g. XMI or JSON would be the input) and create a Python object model to navigate that. The first however (metamodel) would probably be the one to start with. As for Python, ideally the generated code would be compatible with both.

aranega commented 7 years ago

@tarelli OK, if I understand right, the idea is to provide a kind of EMF, but in Python instead of Java... Wow, that would be great, but it is a huge amount of work: EMF contains a lot of stuff (adapters, notification mechanism, binary resources, relationships between metamodels, subsets, derived features...). Indeed, in a logical order, the meta layer is required in order to handle the XMI/JSON representation, as deserialization requires checking that the resource conforms to the metamodel. I'm not a Python expert (at all), but I have the impression that a lot of elements from Python could ease the code production. If the goal is to manage a subset of EMF/Ecore (i.e., without relationships between metamodels, subset management, binary resource deserialization, etc.), it could be done.

I assume that a lot of people could be interested in a Python-EMF (as long as the produced XMI is entirely compatible with the existing Java-EMF), as the language offers a lot of functionality that could be close to OCL.

tarelli commented 7 years ago

@aranega Yes, the full-blown EMF would be a huge endeavour, I agree. We could maybe start with a small subset as a proof of concept and then have more advanced functionality implemented incrementally on a per-need basis. A subset could be for instance:

Generate Python classes from ecore
Deserialize from XMI/JSON
Serialize to XMI/JSON
No binary support

aranega commented 7 years ago

@tarelli Indeed, full support seems way too difficult.

A first step could be to produce the code from an ecore model (a simple one) by hand. From this code, extract the general part in order to produce a core library that mimics the EMF API (a Python EObject with the reflexive layer) and write a first generator to generate the ecore metamodel in Python. After that, the XMI format could be handled.
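As a rough illustration of the "core library with a reflexive layer" step, here is a minimal sketch in plain Python. It is not EMF's or PyEcore's implementation: the names `MiniEClass`, `MiniEObject`, `eGet` and `eSet` are hypothetical and only borrow the spirit of the EMF reflective API.

```python
class MiniEClass:
    """Hypothetical stand-in for an EClass: a name plus typed features."""

    def __init__(self, name, features=None):
        self.name = name
        # Maps a feature name to the Python type its values must have.
        self.features = dict(features or {})

    def __call__(self):
        # Calling the class description creates an instance of it.
        return MiniEObject(self)


class MiniEObject:
    """Hypothetical stand-in for an EObject with reflective accessors."""

    def __init__(self, eclass):
        self.eClass = eclass
        self._values = {}

    def _feature_type(self, feature):
        try:
            return self.eClass.features[feature]
        except KeyError:
            raise AttributeError(
                f"{self.eClass.name} has no feature {feature!r}") from None

    def eSet(self, feature, value):
        expected = self._feature_type(feature)
        # Runtime type check against the metamodel description.
        if not isinstance(value, expected):
            raise TypeError(
                f"{feature!r} expects {expected.__name__}, "
                f"got {type(value).__name__}")
        self._values[feature] = value

    def eGet(self, feature):
        self._feature_type(feature)  # rejects unknown features
        return self._values.get(feature)


A = MiniEClass('A', {'name': str})
a1 = A()
a1.eSet('name', 'myA')
print(a1.eGet('name'))
```

Keeping the metamodel description in ordinary objects is what would later let a generator, or an XMI loader, work against the same layer.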

Also, if the generator input model is the .ecore, references between metamodels should be avoided. If metamodels can reference each other, the input model must be the .genmodel. However, references between metamodels are a nightmare with Acceleo because they depend a lot on the way the Acceleo script is called. In an Eclipse context, the way the metamodels reference each other can be really different from an Eclipse-free context.

aranega commented 7 years ago

Hi @tarelli, I had a few days "almost" free and I started a kind of Python-Ecore project (Python 3). At this point I have the 'basic' behavior and I'm able to create dynamic metamodels as well as static metamodels, and the related instances in both cases. Here are some very simple snippets of what it looks like.

For the 'dynamic' part:

import pyecore.ecore as Ecore

# Simple dynamic metamodel creation
A = Ecore.EClass('A')
A.eAttributes.append(Ecore.EAttribute('name', Ecore.EString))
B = Ecore.EClass('B')
# References are appended to their collection, like the attribute above
B.eReferences.append(Ecore.EReference('a', eType=A))
B.eReferences.append(Ecore.EReference('a_col', eType=A, upper=-1))

# Instance creation and sets (the part handled by the end-user)
a1 = A()
a1.name = 'myA'  # type checking is performed at runtime
a2 = A()
a2.name = 'myOtherA'

b = B()
b.a = a1  # Also type checking is performed at runtime
b.a_col.append(a2)

For the 'static' part:

import pyecore.ecore as Ecore

# Simple static metamodel creation
class A(Ecore.EObject, metaclass=Ecore.MetaEClass):
    name = Ecore.EAttribute(eType=Ecore.EString)

    def __init__(self):
        pass

class B(Ecore.EObject, metaclass=Ecore.MetaEClass):
    a = Ecore.EReference(eType=A)
    a_col = Ecore.EReference(eType=A, upper=-1)

    def __init__(self):
        pass

# Instance creation and sets (the part handled by the end-user)
a1 = A()
a1.name = 'myA'  # type checking is performed at runtime
a2 = A()
a2.name = 'myOtherA'

b = B()
b.a = a1  # Also type checking is performed at runtime
b.a_col.append(a2)

The main core plays with reflection and tries to provide an 'almost' Python experience (way of setting parameters/getting them...etc).

Stuff that is supported:

dynamic/static metamodel creation
instance creation from static/dynamic EClass
eopposites management for ereferences
inheritance
abstract EClass (parameter for dynamic instances or decorator for static ones)
runtime type checking
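To make "runtime type checking" and "eopposites management" concrete, here is one way such behaviour could be obtained with a Python descriptor. This is only a sketch under my own assumptions, not PyEcore's code: the `Reference` class is hypothetical, handles single-valued references only, and identifies the target type by class name so that two classes can point at each other.

```python
class Reference:
    """Hypothetical single-valued reference with type check and opposite."""

    def __init__(self, type_name, opposite=None):
        self.type_name = type_name  # class name of the accepted target
        self.opposite = opposite    # attribute name on the target side

    def __set_name__(self, owner, name):
        self.slot = '_' + name      # where the value is stored per instance

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.slot, None)

    def _conforms(self, value):
        # Walk the MRO so that subclasses of the target type are accepted.
        return any(c.__name__ == self.type_name for c in type(value).__mro__)

    def __set__(self, obj, value):
        if value is not None and not self._conforms(value):
            raise TypeError(f"expected {self.type_name}, "
                            f"got {type(value).__name__}")
        previous = getattr(obj, self.slot, None)
        if previous is value:
            return
        setattr(obj, self.slot, value)
        if self.opposite is None:
            return
        # Detach the old target if it still points back at obj.
        if previous is not None and getattr(previous, self.opposite) is obj:
            setattr(previous, self.opposite, None)
        # Attach the new target so both ends stay consistent.
        if value is not None and getattr(value, self.opposite) is not obj:
            setattr(value, self.opposite, obj)


class A:
    b = Reference('B', opposite='a')


class B:
    a = Reference('A', opposite='b')


a1, a2, b1 = A(), A(), B()
a1.b = b1   # also sets b1.a to a1
a2.b = b1   # moves b1 to a2 and clears a1.b
```

After the last assignment, `b1.a` is `a2` and `a1.b` is `None`, which is the kind of bookkeeping an opposite reference is expected to do.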

The things that still remain to be done (short term):

Eclipse XMI import/export (as XMI is mainly used by EMF, I think it is best to use it, Json later)
the static metamodel generator from .ecore (as a first step)
Object deletion
Meta-operation management in dynamic mode (currently, you can just define them)
Tests
Documentation
Clean the code (currently quite ugly as f***)

The things that still remain to be done (long term):

Notification/Event system
Command system (?)
the static metamodel generator from .genmodel

Also, keep in mind that I'm no Python expert and that the API could suffer from major changes during its lifetime.

I'm aware that at this point it is still not close to your requirements, but it's a first step.

At the same time, I did some research and I found this: http://www.lifl.fr/~marvie/software/pyemof.html The project is old, but perhaps it is better (and also works better) than the one I'm working on.

tarelli commented 7 years ago

@aranega that looks great! Is the code available in some repo yet?

aranega commented 7 years ago

@tarelli Not yet, I have a few things to do before that: add some files, clean up my experiments and provide some examples.

I will try to release something today or at least before the end of the week (sorry about that). I just performed some tests and at the moment everything seems fine, but obviously more tests are required. The missing XMI import/export is also a huge "blocker" at the moment (working on it).

tarelli commented 7 years ago

@aranega no worries, thanks a lot for doing this! Once we review what you have, we can look at the intersections between what we have done manually and what is generated automatically, to figure out the exact requirements. cc @adrianq

aranega commented 7 years ago

My pleasure, I had been planning to work on such things and you gave me the perfect opportunity. Regarding the code generation, you'll need to generate the equivalent of the 'static part' for your own metamodel.

I tried to keep the mapping easy so code generation would be easy to perform. I will provide more advanced examples in the project repository. At the moment, the idea is to keep this kind of mapping:

EPackage -> python module
EClass that inherits from nothing -> python class that inherits from Ecore.EObject and with a special metaclass
EAttribute -> python class property (Ecore.EAttribute)
EReferences -> python class property also, but should be put after all class definitions (mandatory at the moment)

Once XMI import/export is done, I will try to create such a generator.
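To show how the mapping could drive a generator, here is a toy text generator. Everything in it is an assumption for illustration: `generate_module` is a hypothetical helper, the input is a plain list of dicts rather than a real .ecore, and the emitted source merely imitates the style of the static snippet earlier in this thread without having been checked against a PyEcore release.

```python
def generate_module(eclasses):
    """Return Python source for a list of EClass descriptions.

    Each description is a dict (hypothetical input format):
    {'name': str,
     'attributes': {attr_name: ecore_type_name},
     'references': {ref_name: (target_class_name, upper_bound)}}
    """
    lines = ['import pyecore.ecore as Ecore', '', '']
    for ec in eclasses:
        lines.append(
            f"class {ec['name']}(Ecore.EObject, metaclass=Ecore.MetaEClass):")
        attributes = ec.get('attributes', {})
        if not attributes:
            lines.append('    pass')
        for attr, type_name in attributes.items():
            lines.append(
                f"    {attr} = Ecore.EAttribute(eType=Ecore.{type_name})")
        lines += ['', '']
    # References come after every class definition, as the mapping requires,
    # so that a reference can target a class declared later in the module.
    for ec in eclasses:
        for ref, (target, upper) in ec.get('references', {}).items():
            lines.append(f"{ec['name']}.{ref} = "
                         f"Ecore.EReference(eType={target}, upper={upper})")
    return '\n'.join(lines) + '\n'


source = generate_module([
    {'name': 'A', 'attributes': {'name': 'EString'}},
    {'name': 'B', 'references': {'a': ('A', 1), 'a_col': ('A', -1)}},
])
print(source)
```

A real generator would read these descriptions from the loaded metamodel instead of a hand-written list.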

aranega commented 7 years ago

@tarelli So, the code's here: https://github.com/aranega/pyecore I will add more complete examples and continue with bugfixes and my work on the other points. At the same time, I'm adding some tests to improve/ensure the project stability.

tarelli commented 7 years ago

@aranega that's great, thanks! I will start playing with it over Christmas holidays :)

aranega commented 7 years ago

@tarelli Just a brief update on my progress: the lib is more stable (it still needs love) and I'm still working on the XMI deserialization/serialization. The XMI import is on the right track; I was able to deserialize the Eclipse Ecore metamodel and Eclipse UML (for test purposes). Of course, not everything is perfect, but it's a good step forward :)

EDIT> I pushed a new version with a first XMI serialization (basic at the moment). You should be able to read a metamodel and model instances from files, modify the data and write the result to a new resource (or to the same one).

aranega commented 7 years ago

@tarelli Sorry for the multiple posts, I continued to work on the little project. I was able to load a (the?) geppetto metamodel, register all its packages, load an XMI test model, modify it and save the new model as XMI. I put the demo code at this address: https://gist.github.com/aranega/89761454a232106a20ef4184bd9198cf

It is a little long at the moment, but keep in mind that I needed to:

create an XMLType special resource with basic types (your metamodel uses XMLTypes as basic types instead of the basic Ecore one),
register each nested package (also required in EMF-Java)

Altogether it is around 80 LOC, but with proper splitting it could all fit in only a few lines.

This allows you to deal with dynamic instances of your metamodel. A static metamodel generation is still preferable, but it requires a little bit more work.
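The "register each nested package" step can be sketched as a short recursive walk. The helper below is hypothetical and works on any object exposing `nsURI` and `eSubpackages`, the two names an Ecore EPackage carries; with PyEcore the `registry` argument would presumably be the resource set's `metamodel_registry`. The stand-in objects and URIs are made up for the example.

```python
from types import SimpleNamespace


def register_packages(registry, package):
    """Register `package` and, recursively, its sub-packages by nsURI."""
    registry[package.nsURI] = package
    for sub in package.eSubpackages:
        register_packages(registry, sub)
    return registry


# Stand-in objects with made-up URIs, only to exercise the function.
leaf = SimpleNamespace(nsURI='http://example.org/root/types', eSubpackages=[])
root = SimpleNamespace(nsURI='http://example.org/root', eSubpackages=[leaf])

registry = register_packages({}, root)
print(sorted(registry))
```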

tarelli commented 7 years ago

@aranega that's great! We were trying to load the ecore on Tuesday during the Geppetto meeting but we had some errors, will try again with your snippet. Btw if you have time I would be glad to invite you to the next Geppetto meeting on the 17th of January to present what you have done so far! Again thanks so much for doing this :1st_place_medal:

aranega commented 7 years ago

@tarelli I made some fixes since Tuesday (check the master branch), so this time it should load! I must apologise, the error messages are not that great at the moment (I still need to work on this too). Actually, the two things I totally forgot about your metamodel are that it contains nested packages and that it uses xmltypes, and these imply some special handling regarding the EMF resource set. As you can imagine, EMF is quite huge and taming it requires some time; I hope that with time everything will become more stable and advanced. By the way, if you have many test models or a repository that I could test the lib against, that would be great!

Regarding the metamodel loading, with the proper generator, this will not be an issue anymore as the metamodel would be in a Python "format", performances should be better also. Actually, with Python, the only interest of having the static metamodel generated is to avoid the ecore loading and to be able to define methods on the metaclasses.

About your meeting, it would be my pleasure to talk to you about all of this and get a better insight of geppetto, but it really depends on the schedule (hour and stuffs).

tarelli commented 7 years ago

@aranega the meeting is tomorrow Tuesday the 17th at 4pm GMT if you can join!

aranega commented 7 years ago

@tarelli Ouch, no luck on my side, I have some meetings scheduled more or less at the same time. I'll check tomorrow morning If I can move some of them (don't worry, I'll tell you early enough, sorry for the inconvenience). I hope I will be able to join you!

(PS: I pushed more modifications on the project, the static generator is almost done)

aranega commented 7 years ago

@tarelli I moved some meetings and I'm available for the meeting at 4pm GMT (5pm here).

LordKrabo commented 7 years ago

Hi @aranega I have been testing some of your code, and I keep running into this error:

from pyecore.resources import ResourceSet, URI
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyecore/resources/__init__.py", line 1, in <module>
    from .resource import ResourceSet, Resource, URI, global_registry
  File "pyecore/resources/resource.py", line 2, in <module>
    import urllib.request
ImportError: No module named request

Don't know if that helps at all.

aranega commented 7 years ago

@LordKrabo I think I have an idea, PyEcore is designed to work with Python >= 3.3, perhaps you are using Python 2?
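The `urllib.request` module only exists in Python 3, which is why the import fails under Python 2. A guard like the following could make that failure explicit; it is just a sketch, with `supports_pyecore` a hypothetical helper and the `(3, 3)` minimum taken from the comment above.

```python
import sys

MINIMUM = (3, 3)  # minimum version stated in this thread


def supports_pyecore(version_info=None, minimum=MINIMUM):
    """Return True when the interpreter version is at least `minimum`."""
    version = tuple((version_info or sys.version_info)[:2])
    return version >= minimum


if not supports_pyecore():
    print('This interpreter is older than Python %d.%d' % MINIMUM)
```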

tarelli commented 7 years ago

@aranega great, what email should I invite?

aranega commented 7 years ago

@tarelli You can get the email from my profile: vincent.aranega{AT}gmail.com . Do you use webex or something like that?

tarelli commented 7 years ago

@aranega https://plus.google.com/hangouts/_/calendar/YnF2bHJtNjQybTNpcmplaGJldGhva2tjZGdAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ.r02shtr10hagn2ob61ndeo5cq0?authuser=0 on hangout, there should be an event in your calendar now!

aranega commented 7 years ago

@tarelli right, that's good. Hangout's perfect!

LordKrabo commented 7 years ago

@aranega I think you are correct, I have been switching to Python 3.3 and upwards but have been hitting different errors.

~/geppetto-sources/pyecore $ pyenv global 3.5.2
~/geppetto-sources/pyecore $ python
Python 3.2.6 (default, Jan 18 2017, 02:41:24)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.

from ordered_set import OrderedSet
ImportError: No module named ordered_set
  File "<stdin>", line 1
    from ordered_set import OrderedSet
    ^
IndentationError: unexpected indent
  File "<stdin>", line 1
    ImportError: No module named ordered_set
    ^
SyntaxError: invalid syntax
from pyecore.resources import ResourceSet, URI
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyecore/resources/__init__.py", line 1, in <module>
    from .resource import ResourceSet, Resource, URI, global_registry
  File "pyecore/resources/resource.py", line 3, in <module>
    import pyecore.ecore as Ecore
  File "pyecore/ecore.py", line 2, in <module>
    from ordered_set import OrderedSet
ImportError: No module named ordered_set

I guess we will discuss this soon enough.

aranega commented 7 years ago

@LordKrabo It seems that some dependencies didn't install correctly (we will discuss this of course).

LordKrabo commented 7 years ago

@aranega Of course. Forgive me for my forgetfulness, but what did you say the two dependencies were?

aranega commented 7 years ago

@LordKrabo No problem, they are: lxml and ordered-set. If you do a python setup.py install they should be installed too (if they aren't, please tell me, perhaps I forgot something in the setup.py).
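A quick way to see which of these dependencies the current interpreter can actually find is to ask the import system directly. This is a small sketch; `missing_modules` is a hypothetical helper, and note that the ordered-set distribution is imported as `ordered_set`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


# Import names of the two dependencies mentioned above.
print(missing_modules(['lxml', 'ordered_set']))
```

An empty list means both modules are visible to the interpreter running the check, which is worth verifying when several Python versions are installed side by side.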

LordKrabo commented 7 years ago

@aranega Sorry for the delay, tried this again and I am having issues importing additional modules from resources.py

from pyecore.resources import ResourceSet, URI
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name ResourceSet

aranega commented 7 years ago

@LordKrabo that's strange, the ResourceSet import should be fine. Perhaps it comes from the way the imports reference each other in PyEcore (I will change that). From which location did you launch your test script?

aranega commented 7 years ago

@LordKrabo I've just hotfixed the master branch with a new version that uses relative paths instead of absolute ones; that should either fix the issue you have or give more information.

LordKrabo commented 7 years ago

I was launching my test script from the top level directory (~/pyecore). I pulled your changes and I am still running into problems with importing ResourceSet:

from pyecore.resources import ResourceSet, URI
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named resources

Sorry to be such a nuisance!

aranega commented 7 years ago

@LordKrabo don't worry, it is great that you can test it. I'm sorry that it is so painful at the moment; there are probably some tricks or side effects with Python that I'm not aware of. In your last error message, it looks like it cannot find the 'resources' package, which is located in the pyecore one, as if it could not find the __init__.py file in pyecore/resources. That is weird: in Python 3, every subpackage of an importable package is "marked" as "importable" (not sure that word exists). Could you try to create a new virtualenv with python3 installed, install pyecore and try your tests again? I use pew as a virtualenv wrapper (this tool is great https://github.com/berdario/pew):

# if you have python3 by default on your system
$ pew new testenv  
# if you don't have python3 by default on your system
$ pew new -p /usr/bin/python3 testenv

Then, go in the cloned repository (I assume ~/pyecore) and do:

$ python setup.py install

This will install pyecore and all the dependencies. Once everything is installed, you can try to run ipython or python from any location and see if from pyecore.resources import ResourceSet works.

I hope this will work ;)

LordKrabo commented 7 years ago

@aranega Sadly still running into the same error. I created the new testenv with the first command and reran setup.py, but importing from pyecore fails as before. I note that __init__.py has not been placed in the pyecore/resources directory on my installation.

aranega commented 7 years ago

@LordKrabo the __init__.py file exists in pyecore/resources in the repository, so I ran some tests in order to reproduce your issue. I started with a fresh clone of the repository and there were no issues on my side. Also, I use travis-ci for tests with tox, which clones/installs/runs the tests, and it didn't report any errors. Anyway, I successfully reproduced your issue by manually removing the __init__.py file from resources. Perhaps something went wrong in your cloned repository? Could you try git checkout -- . in the cloned repository and see if the file pops back? If it does not, I guess a fresh clone of the repository should fix the issue.
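A missing `__init__.py` of this kind can be spotted with a short scan of the source tree. The sketch below is only an illustration: `dirs_missing_init` is a hypothetical helper and `'pyecore'` is assumed to be the path of the package directory inside the clone.

```python
import os


def dirs_missing_init(package_root):
    """List sub-directories that hold .py files but no __init__.py.

    Paths are returned relative to `package_root`.
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        # Skip caches and hidden directories such as .git
        dirnames[:] = [d for d in dirnames
                       if d != '__pycache__' and not d.startswith('.')]
        has_python_files = any(f.endswith('.py') for f in filenames)
        if has_python_files and '__init__.py' not in filenames:
            missing.append(os.path.relpath(dirpath, package_root))
    return sorted(missing)


# 'pyecore' is assumed to be the package directory of the cloned repository.
if os.path.isdir('pyecore'):
    print(dirs_missing_init('pyecore'))
```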

LordKrabo commented 7 years ago

@aranega Tried git checkout -- . and that seemed to restore __init__.py. Thanks!

On another note, looking at the geppettoModel.ecore file, what should the ecore files refer to instead of the eclipse ecore URLs? http://www.eclipse.org/emf/2002/Ecore

tarelli commented 7 years ago

@LordKrabo thanks for testing the above! @aranega I invited you to the Geppetto contributors, you should find the invitation visiting this. This is the repo where you can commit the API https://github.com/openworm/pygeppetto/tree/master. And these are three XMIs that you can use to test it! Thanks again!

aranega commented 7 years ago

@LordKrabo There is nothing wrong in the .ecore file per se. The URI http://www.eclipse.org/emf/2002/Ecore should not be removed: it tells EMF which metamodel has been used to describe your model. In your case, your model is a metamodel, so it references the Ecore URI.

The URI that is a little bit different in the geppetto metamodel is the one used to reference types, for example http://www.eclipse.org/emf/2003/XMLType#//String. Please note that this is not an issue; it simply references metatypes from another metamodel (the metamodel with this URI: http://www.eclipse.org/emf/2003/XMLType). You probably derived your .ecore from an xsd; in this case, EMF tries to stay compliant with the choices you've made in your xsd (I'm not 100% sure however).

Anyway, this only has an impact on my side. The question is: should I generate an XMLType metamodel implementation in Python, embed the xmltype.ecore file and load it at runtime as a dynamic metamodel, or only 'transform' types during the code generation?

At the moment, the solution I chose is to 'transform' types during the code generation. I'm not sure that this is the best solution, but it will work (actually, it works).
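The 'transform' option could look roughly like the lookup below. This is a sketch of the idea only, not the generator's actual code: the `XMLTYPE_TO_ECORE` table is a hypothetical, partial mapping and `transform_type` is a made-up helper name.

```python
XMLTYPE_NS = 'http://www.eclipse.org/emf/2003/XMLType'

# Hypothetical, partial mapping from XMLType names to Ecore type names.
XMLTYPE_TO_ECORE = {
    'String': 'EString',
    'Boolean': 'EBoolean',
    'Int': 'EInt',
    'Double': 'EDouble',
}


def transform_type(type_uri):
    """Map an XMLType reference to an Ecore type name.

    URIs from any other metamodel are returned unchanged.
    """
    namespace, _, fragment = type_uri.partition('#//')
    if namespace != XMLTYPE_NS:
        return type_uri
    if fragment not in XMLTYPE_TO_ECORE:
        raise KeyError(f'no Ecore equivalent registered for {fragment!r}')
    return XMLTYPE_TO_ECORE[fragment]


print(transform_type('http://www.eclipse.org/emf/2003/XMLType#//String'))
```

Failing loudly on an unmapped type keeps the generator from silently emitting a wrong attribute type.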

@tarelli Thanks for the invitation, the repo and the XMI files. I very quickly ran the tests in dynamic mode on the three models with the current develop version:

dynamic metamodel

model1 (Big) -> it loads (around 10.5s)
model2 (Large) -> it loads (around 11.5s)
model3 (Medium) -> it does not load; it seems that there are many XMI documents embedded in this one (by the way, it is called Medium, but it is clearly bigger than the others, two times bigger than Large). The XML parser cannot read it.

As you see, performances are not great. I'm currently working on a new version (branch feature/refactoring) and I ran the tests in a dynamic and static mode:

dynamic metamodel new version

model1 (Big) -> it loads (around 5.7s)
model2 (Large) -> it loads (around 6.4s)
model3 (Medium) -> cannot load it (issues in the XML?)

static metamodel (with generated metamodel) new version

model1 (Big) -> it loads (around 5.5s)
model2 (Large) -> it loads (around 6.2s)
model3 (Medium) -> cannot load it (issues in the XML?)

With the new version, performances are better (2x faster), but it still does not sound so great :\ (model loading only). I hope I will be able to find a solution to enhance everything soon without breaking the retro-compatibility...
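For anyone who wants to reproduce this kind of measurement, a small timing wrapper is enough. The sketch below is generic and not the script used for the figures above; `timed` is a hypothetical helper and the `sum` call stands in for whatever function loads a model.

```python
import time


def timed(loader, *args, **kwargs):
    """Call `loader` and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = loader(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; a real run would pass the model loading function.
result, elapsed = timed(sum, range(1000000))
print('loaded in %.3fs' % elapsed)
```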

At least, I will try to push a version of the geppetto Python API with the new version. Also, I saw in the commit logs that you purged the whole repository, but it seems that there was work on UI and computations. Will these not be missed?

Thanks again for the invitation, the repository and the test files!

LordKrabo commented 7 years ago

No worries. Would you like me to add some of these prerequisites as part of setup.py?

aranega commented 7 years ago

@LordKrabo you mean about the .ecore?

tarelli commented 7 years ago

@aranega what error do you get with Medium Net? It loads fine with the dynamic editor. As for the stuff that I purged we had already moved it elsewhere so no worries but thanks for double checking :)

(screenshot attached: 2017-01-28 at 13:00:41)

aranega commented 7 years ago

@tarelli I have the same issue under Eclipse; the XMI does not seem valid. PyEcore (actually lxml) reports an issue at line 3319 and so does the SAX parser under Eclipse. The file I get from the gist is quite large (> 127,000 lines) and at line 3319 it looks like there is another <?xml ..> declaration, almost as if many .xmi files had been concatenated together.
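A concatenation like this is easy to confirm by listing the lines on which an XML declaration appears; a well-formed document has at most one, on its first line. The helper below is a hypothetical sketch that accepts any iterable of lines, so an open file object would work as well as the in-memory example used here.

```python
import io


def xml_declaration_lines(lines):
    """Return the 1-based numbers of the lines containing an XML declaration."""
    return [number for number, line in enumerate(lines, start=1)
            if '<?xml ' in line]


# In-memory stand-in for two documents pasted one after the other.
sample = io.StringIO(
    '<?xml version="1.0" encoding="ASCII"?>\n'
    '<first/>\n'
    '<?xml version="1.0" encoding="ASCII"?>\n'
    '<second/>\n'
)
print(xml_declaration_lines(sample))
```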

tarelli commented 7 years ago

@aranega I reuploaded MediumNet.net.nml.xmi on the gist, maybe there was an error as I drag and dropped them, the file should have only 3318 lines :)

LordKrabo commented 7 years ago

@aranega I was thinking more the pew dependencies and the testenv commands above.

aranega commented 7 years ago

@tarelli Thanks! I was a little worried that Big < Large < Medium :). So, I've tested this time with the Medium model: it loads, no issue. With the new version (which is the default one now), I measured these times:

Medium model - dynamic metamodel -> loads ok (around 0.950s)
Medium model - static metamodel -> loads ok (around 0.500s)

I released a new version of pyecore, from 0.0.10 to 0.1.1 (0.1.0 is a lost version), so I will commit very soon the Geppetto API on the repository (probably tomorrow, I just need to check some points). Also, pyecore is now on pypi, a simple pip install pyecore is now enough to install it.

@LordKrabo I don't think this is mandatory. The use of pew and tox is a matter of choice on my side, and working with virtualenvs is more of a general guideline when dealing with external Python packages, so it is not required to list them as dependencies in the setup.py. However, these recommendations could be introduced in the README, as they help to keep a sane system.

aranega commented 7 years ago

@tarelli @LordKrabo I've pushed a version of the pygeppetto API to the development branch of the repository (https://github.com/openworm/pygeppetto/tree/development). I've quickly written some directions in the README.md, but there are still some questions that remain. I also released a new version of PyEcore (0.1.2) on pypi. If you have already installed a version of pyecore using pip, you can upgrade the package using: $ pip install pyecore --upgrade.

aranega commented 7 years ago

@tarelli Sorry, I had a lot of work lately and I didn't find the time to join you at your last meeting. I'm planning to add the uri <--> EPackage mapping in the root package for the pygeppetto API this weekend, as well as the mapping between your 'master' URI and your 'development' one (I didn't notice that you have two different URIs). Does this sound right to you?
