jeff-dh / SolidPython

A python frontend for solid modelling that compiles to OpenSCAD

Add testing for extension libraries (starting with BOSL2) #23

Closed. ssuchter closed this issue 1 year ago.

ssuchter commented 1 year ago

@jeff-dh mentioned that it's not possible to simply run autogeneration of BOSL2 python stubs, since the generated code sometimes needs manual tweaks before it works correctly.

In the long-term service of making things better, I think it'd be nice to add some automated tests of whether the python wrappers around the 3rd party code are doing the right thing. I'm imagining starting with targeted tests, not comprehensive ones. I could imagine a few steps, and we can do some or all of them (a rough sketch of steps 1 and 2 follows the list):

  1. Have some python test code that utilizes solid2.bosl2 and run it to generate .scad files. If the .scad files cannot be generated, that's a fail.
  2. Call OpenSCAD (command line version) to actually render the output of step 1.
  3. Write some scad code that directly uses BOSL2, and utilize OpenSCAD to determine the volumetric difference between the solids generated from the scad code vs the python code. A difference below a small threshold (to account for floating-point jitter) counts as success.
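A rough sketch of steps 1 and 2 (assuming the bosl2 wrappers are importable as solid2.bosl2 and expose e.g. cuboid, that save_as_scad accepts a target filename, and that an openscad binary is on the PATH -- all of these details would need checking):

import subprocess

from solid2.bosl2 import cuboid  # module path as named in this thread; the real path may differ

def test_bosl2_smoke(tmp_dir="/tmp"):
    scad_file = f"{tmp_dir}/bosl2_smoke.scad"
    # Step 1: if the wrapper (or its generated stub) is broken, this raises and the test fails.
    cuboid(10, rounding=2).save_as_scad(scad_file)
    # Step 2: let the OpenSCAD CLI actually render the result; a non-zero exit code is a fail.
    result = subprocess.run(["openscad", "-o", f"{tmp_dir}/bosl2_smoke.stl", scad_file])
    assert result.returncode == 0
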
ssuchter commented 1 year ago

@jeff-dh I wrote the above, hoping to start a conversation with you on whether you think it's a good/helpful idea or not.

jeff-dh commented 1 year ago

Hmmmmmm ;)

In general I'd like to keep the code base as small as possible. This is quite a high priority for me.

Since the generated python stubs are really only stubs which create an OpenSCAD call with the same name, I think there's not a lot that could go wrong at all -- at least not if you're changing the stubs.

All the issues I had to manually tweak were -- as far as I remember -- issues where there were multiple parameters with the same name. Like:

def foo(a, b=None, c=None, b=None)

All the errors appeared when I imported the files as far as I remember.
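(For reference, Python rejects such a stub as soon as the file is compiled or imported:)

def foo(a, b=None, c=None, b=None): ...
# SyntaxError: duplicate argument 'b' in function definition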

I'm actually not sure whether I checked all files, maybe only the std ones? I possibly also utilized pyright.

So this is / was actually no big deal (as long as we update the bosl2 lib every once in a while). And I don't really like the idea of extra testing code if there's no real need.

Btw:

I'm wondering whether the two of us are looking at the bosl2 lib from different perspectives.

For me the bosl2 lib (which is part of the solidpython2 package itself) is supposed to be a kind of standard library for solidpython2. It's actually a replacement for solidpython(1).utils. Because of that, I'm actually afraid that updating bosl2 all the time causes breaking changes. I don't want to say that we should stick to a certain version forever, but I'm not sure how to handle this issue at the moment. I was wondering whether updating bosl2 with a breaking change could be treated as a minor version change (which would technically not be correct, but.... I don't want a major version change because of a bosl2 update) and we could update it every once in a while. But that's also not really cool. What if you have a code base that pins you to a certain bosl2 version but would like to have the latest bugfixes?

Another thought is whether it would make sense to create a solidpython2 extension that could be kept -- as automatically as possible -- up to date with the latest bosl2 master. This would have a version that's independent of solidpython2. As such you could get the latest solidpython2 bugfixes while sticking to a certain bosl2 version. Furthermore there could be more regular updates, more sophisticated automation logic and so on without interfering with the solidpython2 core.

Do you know about solidpython2-extensions? Something like that: solidpython2-bosl2 as a separate package / extension. With that we could really pin the bosl2 version in the core to a certain version -- and let it sit there as a std lib for quite some time -- and we would have an easily installable and usable package that includes the latest bosl2 lib. We could actually even think about removing the bosl2 lib entirely from the solidpython2 repo in the long term.

https://github.com/jeff-dh/solidpython2_extensions https://github.com/jeff-dh/solidpython2_legacy (there are pypi packages for both)

(btw: reintegrating such an extension into the solidpython2 repo should be easy if necessary)

These are all just thoughts for now. I need to think about it for some more time and I'm happy to have a discussion about it.

ssuchter commented 1 year ago

Nice discussion!

If the only known issue is the doubled arguments, isn't that an argument that something ought to be fixed in the guts of bosl2_generator? I could totally believe that the thing to do is to fix that, and we don't need testing (yet).
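As a rough illustration only (I don't know the internals of bosl2_generator, so the helper name and the data shape here are hypothetical), the generator could drop or rename repeated parameter names before emitting a stub:

def dedup_params(params):
    # params: list of (name, default) pairs extracted from a .scad signature (assumed shape)
    seen, unique = set(), []
    for name, default in params:
        if name in seen:
            continue  # alternatively rename, e.g. to f"{name}_{len(unique)}"
        seen.add(name)
        unique.append((name, default))
    return unique

# dedup_params([("a", None), ("b", None), ("c", None), ("b", None)])
# -> [("a", None), ("b", None), ("c", None)]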

I still wonder about the lightest weight of testing, though - just ensure the code actually executes for at least one bosl2 call and generates scad code. My thought here is to have the lightest possible smoke test that will prevent a completely failed release. E.g. this is something that could be done in 1-2 new short files in the repo.

Also, on the BTW stuff: I think the ideas of putting the library in a side project or putting it in the main project, or updating frequently vs occasionally, are both orthogonal to the idea of validation.

jeff-dh commented 1 year ago

Yeah, fixing the double parameters might be cool! I hesitated with it back then because of blablub....

I also thought about the "lightest weight of testing". One question then is whether it should be an overall library task (there are no tests at all for solidpython2 yet). At the same time I'm a little afraid of it, because the solidpython(1) tests were more in the way than helping when I worked extensively on solidpython2. But yeah.... I might be open to at least give it a try as long as it somehow stays light and does not test too much in detail. In other words, finding a way to test the basic overall functioning with as little code as possible might be cool.

Right now a very, very light smoke test (that I use to check that there are no really big issues) is to run solid2/examples/run_all_examples.py. But yeah, it would be nice to check whether the resulting openscad code renders (correctly?). To be honest I'm really not into testing and there's a reluctance towards it -- probably for "historical reasons", and it might not make sense.... if you're motivated to propose some test(s) that "don't get in the way", that might be interesting.

Hah, maybe this could help us: I just remembered that there's this function: ObjectBase.render_as_stl(...) (it's actually RenderMixin in the latest head). It should return a path if an object renders successfully with openscad and None if it does not. Maybe this could be integrated into the examples somehow and thus provide basic testing....? Something like adding assert(mainObject.render_as_stl() != None) as the last line of each example? Maybe this could be done from outside the examples themselves, somehow importing the examples into a test suite (if not __name__ == "__main__": assert(...)?). We might also check the results against previously generated outputs.
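A rough sketch of that "import the examples into a test suite" idea, assuming each example exposed its top-level object under an agreed name (mainObject here is purely hypothetical):

import importlib.util
from pathlib import Path

def check_example_renders(example_path):
    # Import the example as a module; since __name__ is not "__main__", its main guard is skipped.
    spec = importlib.util.spec_from_file_location(Path(example_path).stem, example_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # render_as_stl returns a path on success and None if OpenSCAD fails.
    assert module.mainObject.render_as_stl() is not None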

The reason why those questions are not totally orthogonal from my point of view is that if you worked on an external bosl2 library, I would just let go. You could go for whatever you want and we could discuss afterwards whether we'd like to reintegrate it into the main repo or leave it as a separate repo. How -- for example -- the generation is triggered (-> poetry build hook) or whether there are tests could be completely up to you. On the other hand, if you'd like to spend some time thinking about a basic test for the whole library, that would be cool too!

Again, these are all just rough thoughts about it, feel free to develop them further and make suggestions if you like to.

jeff-dh commented 1 year ago

Just to let you know, I got a 90% working version of my proposal in a local copy, I'll try to finish it and commit it soon.

jeff-dh commented 1 year ago

aaah, here's my proposal (forgot to "submit" this earlier)....

I would suggest the following: copy the examples into the test directory, run run_all_examples.py there, compare the generated .scad files against a copy of the expected output kept in tests/example_results, and check that each output file actually renders with OpenSCAD.

Alternatively we could omit the copy of the generated scad files in tests/example_results and utilize git diff to check whether run_all_examples produces output different from before (the expected output). I think I would actually even prefer this.

I guess this would give us quite some test coverage while adding very little code. It would check that all the examples produce the output we expect and furthermore check that the output is openscad renderable (-> the includes are ok). Furthermore this does not get in the way a lot. If there's a change to solidpython that deliberately produces a different output, we can easily check it (-> git diff) and only need to update the corresponding .scad file in the example directory to "commit" those changes so the tests are updated. We would not need to adjust the test(s) themselves at all.
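Assuming the examples keep writing their .scad output next to themselves (as they do today via save_as_scad) and the output files are committed, the whole diff check could stay roughly as small as:

import subprocess

def test_examples_unchanged():
    # Regenerate all example output...
    subprocess.run(["python", "solid2/examples/run_all_examples.py"], check=True)
    # ...and fail if any generated .scad file differs from what is committed.
    diff = subprocess.run(["git", "diff", "--exit-code", "--", "*.scad"])
    assert diff.returncode == 0, "generated .scad output changed; check `git diff` and commit if intended"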

Any thoughts about it? Are there any cases you'd like to test that would not be covered by this?

jeff-dh commented 1 year ago

if you're interested have a look at https://github.com/jeff-dh/SolidPython/tree/examples_unit_test

What do you think about it? It caught all my test cases and even found a "real" bug in an example......

ssuchter commented 1 year ago

I definitely agree that we want to be appropriately lightweight, and your approach is pretty similar to what I imagined.

No concerns at all with checking the output of a call to openscad -o.

I'm a little worried that by just diffing the outputted .scad files we'll spend a lot of time validating spurious differences. I suppose we can worry about this if it becomes a problem.

ssuchter commented 1 year ago

I tried to run this, and I noticed that 07-libs.py doesn't run:

Traceback (most recent call last):
  File "/Users/ssuchter/src/ssuchter-SolidPython/solid2/examples/./07-libs.py", line 14, in <module>
    bosl.metric_screws = import_scad("BOSL/metric_screws.scad")
  File "/opt/homebrew/lib/python3.10/site-packages/solid2/core/scad_import.py", line 113, in import_scad
    load_scad_file_or_dir_into_dict(filename, dest_namespace.__dict__, use_not_include, skip_render)
  File "/opt/homebrew/lib/python3.10/site-packages/solid2/core/scad_import.py", line 71, in load_scad_file_or_dir_into_dict
    raise ValueError(f'Could not find .scad file {filename}.')
ValueError: Could not find .scad file BOSL/metric_screws.scad.

I guess we should remove it, since it's been replaced with 07-libs-bosl2.py?

jeff-dh commented 1 year ago

Hmmm as the comment a couple lines above notes:

# [...] This example needs the BOSL
# library to be installed in the OpenSCAD library path
# (~/.local/share/OpenSCAD/libraries)

I would propose that we exclude it from the tests. I've done the same for the implicitCAD examples. I would like to have a look at how we can easily exclude certain examples from the tests. Atm it's hard coded in the test; I'd like to have some kind of "decorator" for each example which would exclude it from the test.

-> To be fixed

That example has a different purpose than the other 07- examples. It's supposed to demonstrate how to "import" any openscad code / library. Furthermore the numbering is not optimal for all the 07- examples; that has historical reasons. All the 07- examples could be cleaned up somehow.

jeff-dh commented 1 year ago

I'm a little worried that by just diffing the outputted .scad files we'll spend a lot of time validating spurious differences. I suppose we can worry about this if it becomes a problem.

I don't think this is gonna become a problem. Since the scad files are generated, they don't change unless you either change the input (the example itself) or introduce a change -- intentionally (feature) or not (bug) -- that causes different output. In both cases I think it's nice to be informed about it, and you can simply update the tests once (-> commit the diff to git).

ssuchter commented 1 year ago

Sure. Would you like any help here? I didn't file the issue necessarily expecting you to do it; I was really just creating a discussion.

I definitely buy the "decorate a test" methodology, but the tests are currently all structured as standalone scripts, as opposed to methods or classes that easily take decoration. We could put a special comment or no-op piece of python code into tests that shouldn't contribute to the examples output, and make run_all_examples.py look for this and not invoke the script - is this what you imagined?

I bet we can also cause the tests to output into a specific directory by making run_all_examples.py set the CWD to the target directory.

jeff-dh commented 1 year ago

I actually really like this kind of test (the diff test) because:

I see SolidPython actually as a very funny kind of transpiler. SolidPython transpiles python code to openscad (under certain circumstances). As such the transpiler is supposed to generate exactly the same output for the same input. And as soon as the output changes it is either intentionally or a bug.

PS: it gets even weirder. The import_scad mechanism of SolidPython is actually something like an openscad-to-python transpiler :D

ssuchter commented 1 year ago

Yeah, I'm happy to try implementing SolidPython test with the diffing method.

jeff-dh commented 1 year ago

Yeah, the best solution I've come up with yet is to utilize the filename as the decorator and check it against a regular expression in the unittest / "run_all_examples". Something like 01-first.py vs 02-second.excludeTest.py or similar, but I couldn't decide on a "suffix/prefix/whatever" yet. Any good suggestions?
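A small sketch of the filename idea (using the .excludeTest.py placeholder from above; the actual suffix is still open, and the glob assumes the examples keep their numeric prefixes):

import re
from pathlib import Path

EXCLUDE = re.compile(r"\.excludeTest\.py$")

def examples_to_test(examples_dir="solid2/examples"):
    # yield every numbered example script except the ones opting out via their filename
    for f in sorted(Path(examples_dir).glob("[0-9]*.py")):
        if not EXCLUDE.search(f.name):
            yield f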

I also thought about the "cwd-issue" but was not motivated to get into it. This was easier and good enough for now. But if there's a simple way to achieve it, I'm very happy to replace the copy call with it. It's more like brainstorming: trying an idea and whoop, there's 90% of a solution. The brainstorming is the real work ;) You're welcome to contribute ideas, discussions, code and pick up code, rewrite it, propose alternatives, ... whatever you like.

PS: Another idea from the back of my head: do you think it would be nice to have some kind of watchdog util that re-executes a solidpython script each time it changes? In combination with openscad's auto-reloading it would enable a workflow where saving a file triggers openscad to reload automatically. I don't really get a grip on that idea, if you have any ideas.....
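(Just to make that idea concrete, a dependency-free sketch; rerun_on_change is a hypothetical helper, not part of SolidPython:)

import os, subprocess, sys, time

def rerun_on_change(script, interval=0.5):
    # Re-execute the script whenever its mtime changes; OpenSCAD's auto-reload
    # then picks up the regenerated .scad file.
    last = 0.0
    while True:
        mtime = os.path.getmtime(script)
        if mtime != last:
            last = mtime
            subprocess.run([sys.executable, script])
        time.sleep(interval)

# usage: rerun_on_change("my_model.py")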

ssuchter commented 1 year ago

A question about the 'git diff' based methodology (which I do like for this) - would we want something to fail if there are diffs? For example, would we want 'poetry build' to fail? (That feels right to me.)

The workflow would be:

  1. Make some changes to code
  2. Notice that the tests fail with a diff, either because you ran 'poetry build' or a separate way to run the tests
  3. Fix the failure either by changing code or by committing the generated scad file changes
  4. Tests pass/poetry build succeeds
jeff-dh commented 1 year ago

hmmmmm

If we wanted to check something, I think we would need to check whether the tests pass, and that would add a delay of several seconds to a poetry build call. I don't think that's worth it. I'm fine with triggering the tests manually.

Simply run the test before you commit / push.

I was wondering whether there's a way to add "poetry commands". Like:

poetry generate_bosl2
poetry generate_openscad_extension ./path/to/scad ./path/to/py
poetry test
...?

This would give us a central place to invoke certain "scripts" from a common interface

jeff-dh commented 1 year ago

Yeah, I'm happy to try implementing SolidPython test with the diffing method.

what do you mean by this? Is this the same as what I did in https://github.com/jeff-dh/SolidPython/tree/examples_unit_test ?

jeff-dh commented 1 year ago

ah, I just remembered, I don't think letting the examples output to a different directory is easily possible, because the save_as_scad call uses __file__ under the hood to determine the output file. Since I don't want to adjust the examples to the testing framework, I think copying them is the best way to go.

jeff-dh commented 1 year ago

I added the changes to the examples_unit_test branch and would probably merge it into master soon if you're fine with this proposal.

ssuchter commented 1 year ago

I looked through the code in the examples_unit_test branch, LGTM.

ssuchter commented 1 year ago

Re: "poetry commands" - How about I start by making "poetry run pytest" do the right thing?

jeff-dh commented 1 year ago

did you see #27 ?

I'll close this one, it's merged.

I think "poetry commands" would be nice if we could get a common interface for all scripts, but since peotry run does not accept parameters, we can't.... so hmmmm