datafoodconsortium / connector-codegen

An Acceleo project to generate the source code of the DFC connector into all the supported programming languages.
MIT License
3 stars 3 forks source link

TypeScript implementation #20

Open jgaehring opened 6 months ago

jgaehring commented 6 months ago

This should represent all the main features for the implementation, to eventually resolve datafoodconsortium/connector-typescript#4, except for a few outstanding issues:

  1. Failing to import several interfaces, such as IPerson, ISocialMedia, and IPhoneNumber, into classes Agent, Enterprise, etc. These seem to coincide with the features added in 2.1.0, or maybe that's just a red herring.
  2. Misaligned types for getters in DefinedProducts, specifically getters for multiple properties that implement subclasses of ICharacteristic, such as getPhysicalCharacteristics, getAllergenCharacteristics. I haven't identified any others. Possibly something to do with the ISKOSConcept, which also seems to be having some issues getting imported correctly for the interface definitions.
  3. No tests yet, and also need to find a good way to test the prefix mappings to the semantic IRI's (e.g., "dfc-b:Catalog").

Regarding those prefix mappings, I see that the PHP semantizer does have a setPrefix method that sets it for the connector:

https://github.com/datafoodconsortium/connector-codegen/blob/c177f39c80f6f7e3c71566e65caabd52afa175d5/src/org/datafoodconsortium/connector/codegen/php/static/src/Connector.php#L17-L21

But as far as I can tell, there's no such analog yet in the TypeScript semantizer.

I'm leaving this as a draft for now until those issues are resolved or worked around, but if easier to address them in the meantime by marking it for review, @lecoqlibre or @RaggedStaff, please go ahead!

jgaehring commented 6 months ago

2. Misaligned types for getters in DefinedProducts, specifically getters for multiple properties that implement subclasses of ICharacteristic, such as getPhysicalCharacteristics, getAllergenCharacteristics. I haven't identified any others. Possibly something to do with the ISKOSConcept, which also seems to be having some issues getting imported correctly for the interface definitions.

I fixed the underlying issues here, which I believe did stem from changes in 2.0.0. This leads me to think if first issue listed above, regarding imports for interfaces like ISocialMedia and IMainContactOwner, may indeed be related to the changes in 2.1.0.

I found that a similar issue, where ISKOSConcept failed to be imported and SKOSConcept was imported instead. I was able to remedy this in product.uml, where I changed the href from the id for SKOSConcept to ISKOSConcept (jgaehring/data-model-uml@2e78e7e):

-      <elementImport xmi:id="_SQ0AwOqNEeyCo6a7S9M9vA">
-        <importedElement xmi:type="uml:Class" href="skos.uml#_XhC24E2LEe2Pq5gmFYw1MQ"/>
+      <elementImport xmi:id="_6HIqYE76Ee2TpoYuSsmzTA">
+        <importedElement xmi:type="uml:Interface" href="skos.uml#_r06YgE1mEe2Pq5gmFYw1MQ"/>

I feel I'm starting to exceed my proper understanding of the UML model and don't want to be introducing changes like this without additional feedback. I'll see what headway I can make with the last issue, regarding tests, but will mark this ready for review so that others can suggest improvements.

jgaehring commented 6 months ago

Oh great, all those points make sense. Thanks, @lecoqlibre! :pray:

would need push access to your fork in order to add my commits to this PR, could you allow me?

I believe this is enabled now? If you have difficulty pushing to it let me know.

Old references to static context should be replaced with https://www.datafoodconsortium.org (in the Connector class and in tests)

Ah, yes I saw this in the original TS codegen when I tried to run the tests there too and wondered. I can do a quick search and replace easily enough and push that up first.

Abstract sub classes should not re-implement methods already implemented in parents (like some methods in Agent sub classes).

Got it, I think I can fix that easily enough.

jgaehring commented 6 months ago

Old references to static context should be replaced with https://www.datafoodconsortium.org (in the Connector class and in tests)

Ah, yes I saw this in the original TS codegen when I tried to run the tests there too and wondered. I can do a quick search and replace easily enough and push that up first.

I just realized that some places in the PHP static code it references the JSON and RDF data from other DFC repositories, like below:

https://github.com/datafoodconsortium/connector-codegen/blob/c177f39c80f6f7e3c71566e65caabd52afa175d5/src/org/datafoodconsortium/connector/codegen/php/static/src/Connector.php#L17-L22

And in the Ruby static code here:

https://github.com/datafoodconsortium/connector-codegen/blob/c177f39c80f6f7e3c71566e65caabd52afa175d5/src/org/datafoodconsortium/connector/codegen/ruby/static/lib/datafoodconsortium/connector/context.rb#L17-L22

I gather these are fulfilling the same goal as context.json in the original TypeScript repo, in which case, would it make sense to just import the JSON from https://github.com/datafoodconsortium/ontology/releases/latest/download/context.json ?

I'll try replacing those as best as I can figure out and push it on top of your two latest commits (or perhaps a separate fork, or both), that way you can see the diffs and it should be more clear.

jgaehring commented 6 months ago

Here's the commit with the substitutions I made for static.datafoodconsortium.org: jgaehring/connector-codegen@92f8218

jgaehring commented 6 months ago

I'm having a devil of a time with the tests. There are likely other issues (see * below), but a major hindrance seems to be related to the Jest test runner itself and using ECMAScript Modules instead of CommonJS, which seems linked to use of the NODE_OPTIONS=--experimental-vm-modules environment variable. This seems to be blocked by an open issue in Node itself. It's been open for a few years and there are some workarounds, but I'm still not sure I see what would work for our case, if that even is the problem. Here are the relevant issues:

Here's what the error message looks for one of the tests. Note it refers to a line 698, which is line 533 in the source, from the jest-runtime library.

 FAIL  test/Address.test.js
  ● Test suite failed to run

    request for './body.js' is not yet fulfilled

      at Runtime.linkAndEvaluateModule (node_modules/jest-runtime/build/index.js:698:5)

ReferenceError: You are trying to `import` a file after the Jest environment has been torn down. From test/Address.test.js.

Have you encountered anything like this, @lecoqlibre?

* = The other issue I referred to is an error that reads "IRIs can not be mapped to other IRIs", which I suspect is related to some of our previous discussions around IRIs, but I'm hesitant to begin diagnosing that problem until the issues with Jest are resolved. As it is, the IRIs error only appears sporadically, depending on whether the issue with Node ESM imports throws an exception or not, which is not ideal for debugging.

lecoqlibre commented 6 months ago

Have you encountered anything like this, @lecoqlibre?

Yes absolutely, same here. I encountered this issue in the previous version of the TS connector. I think I found a workaround but don't know if it can work for the new version and the evolution of Jest. Are you familiar with other test framework that we might use instead of Jest?

The other issue I referred to is an error that reads "IRIs can not be mapped to other IRIs"

I think it's related to the prefixes we are now using in the connector. I'm working on the Semantizer to add a setContext method like you suggested (instead of setPrefix).

jgaehring commented 6 months ago

Are you familiar with other test framework that we might use instead of Jest?

Actually, I've just about got the native Node.js test runner working. It was introduced with Node v.18 (LTS) back in 2022, so it's stable.

I was about ready to say I fixed it, but I wanted to try it out with a clean build of the generated code, and now I'm seeing some weird syntax errors in the generated TypeScript. I haven't changed anything on the Acceleo side, just the static files in typescript/static/test/ and the static package.json file. I'll report back when I figure it out, hopefully soon! :crossed_fingers:

jgaehring commented 6 months ago

I wanted to try it out with a clean build of the generated code, and now I'm seeing some weird syntax errors in the generated TypeScript.

Ohhh, I think this might be related to when I deleted the bin/ directory, which I did without thinking when I deleted the gen/. I've walked back my git history one-commit-at-a-time to b39cbde, which I know was working for me, both yesterday and this morning, but I'm still getting those weird syntax errors in the generated code. Now it seems that whenever I delete the binaries, the next time I run the generator, it exits before any of the TypeScript (or PHP or Ruby) has a chance to build. I just get this error in the Eclipse console:

Error: Could not find or load main class org.datafoodconsortium.connector.codegen.Generate
Caused by: java.lang.ClassNotFoundException: org.datafoodconsortium.connector.codegen.Generate

However, it has rebuilt the binaries after that first try, so then when I run the generator after that, it completes the process and generates the connector files, but with tons of syntax errors in the generated code. There are also some new errors in the Acceleo console, which may be related to that initial ClassNotFoundException, unless they were always there and I just never noticed:

Couldn't find class org.datafoodconsortium.connector.codegen.Common in the classpath.
java.lang.ClassNotFoundException: org.datafoodconsortium.connector.codegen.Common
    [followed by long stacktrace...]

org.eclipse.acceleo.engine.AcceleoEvaluationException: Class org.datafoodconsortium.connector.codegen.Common Couldn't be found in the classpath of the bundle containing module queries.emtl.
    [followed by long stacktrace...]

So I guess this is something I broke with my config when I deleted those binaries the first time? And for some reason it's not able to rebuild those binaries without error when I try to start over with a clean slate?

I need to step away for a little bit, but in the meantime, I'm pushing two commits that swap out the Jest test runner for the native Node test runner. Maybe if you pull those down, you can get them to work for you?

jgaehring commented 6 months ago

Oh, and I've got one more commit I'm not pushing onto this branch just yet, b/c I didn't want to muddy the waters any further, but I removed some unnecessary async keywords, since they might interfere with the test runner or cause it to time out or something unexpected. Probably a best practice for either Jest or the Node test runner. Here are the diffs:

https://github.com/jgaehring/connector-codegen/compare/typescript...jgaehring:connector-codegen:async-tests

lecoqlibre commented 6 months ago

My last commit edaf89f integrates the dev (alpha.3) version of Semantizer. I changed the Semantizer dependency type to a GitHub repo, this way we can run npm update @virtual-assembly/semantizer every time changes are made.

@jgaehring I also updated dependencies of rdf-ext.

Also, note that we should use the UML model from the next branch of the data-model-uml repo when generating.

I merged your commits about using node:test and without some unnecessary async. I got all the tests passing except the ones for the import feature and the remover methods. For now we should comment the remover tests and expect Exceptions. I'm looking at how to fix the import tests. I think it's related to the context.

As the Semantizer has now a dedicated class which act like an environment, we should move in the future some components from the connector to the Semantizer (like I did for the PHP version). Especially import/export, store, factories.

lecoqlibre commented 6 months ago

So far we get all the tests passing!

Some tasks remain:

@RaggedStaff We should check that the transformation loop suits the needs. Have we some use cases from the users? I did a simple test in the PlannedTranformation test file. I have trouble to see for instance how we can do a combine transformation as the ConsumptionFlow takes only one DefinedProduct as input.

jgaehring commented 5 months ago

I've got a start at the test generation (2c8507a). So far I'm just testing the constructor. There are a total of 109 tests with 99 passing and 10 failing. I've put the relevant tests and the full results up on this gist:

https://gist.github.com/jgaehring/bde27fe6991a289cc7016ed77bc1120e

There seem to be three kinds of errors that I'm not yet sure how to resolve:

  1. Duplicate import statements, as in Enterprise.test.js, or missing imports, as in QuantitativeValue.test.js
  2. Unknown ontology type in PlannedConsumptionFlow.test.js, which may be related to the Flow superclass or the ConnectorFactory, based on the stacktrace.
  3. A weird issue where a plain string ends up as the test value for images in SuppliedProduct.test.js, which results in a images.forEach is not a function error, b/c it should be expecting an array of strings. The generated PHP tests also have a plain string for the images property, so maybe this is something to do with how the test values are generated?

I have to put this down for now, might not be able to pick it back up til later this week. It's taken me a bit longer than anticipated (what else is new), but if these issues can be knocked out, hopefully tests for the getters and setters will follow in short order. :crossed_fingers: