VUnit / vunit

VUnit is a unit testing framework for VHDL/SystemVerilog

http://vunit.github.io/

Other

739 stars 263 forks source link

Let VUnit execute simulators which are inside a container or VM, from outside #324

Closed eine closed 6 years ago

eine commented 6 years ago

Because VUnit requires the simulator to be available in the path, and executable names (vsim, vcom, etc.) are 'hardcoded', VUnit must be installed in the same docker image as the simulator. In a use case, I have two images, one with GHDL and a different one with QuestaSim, and VUnit is installed in both of them. Indeed, because I have versions with and without VUnit, there are four images. Each time VUnit is updated, two of them have to be rebuilt.

My proposal is to make VUnit execute simulators which are inside a container, from outside:

Have a local python installation, but do not install simulators in the host.
- This would allow you not to compile GHDL in each test, but still test multiple python versions with the latest GHDL in travis.
Build a single VUnit docker image that can interact with simulator containers.

In the latter, I would have three images instead of four, and I would have to rebuild a single one. Note how this would scale with more simulators. Furthermore, if the VUnit docker image was generated in the vunit travis-ci flow, I would consume it as an alternative to ghdl/ext:vunit.

Is it possible to prepend (not append) args to every vsim, vcom, ghdl call?
Is it possible to hook commands (to create/kill temporal containers) before and after each 'task'? By 'task' I mean a sequence a commands to be executed in the same container.

A workaround would be to 'hack' it at system level. That is, write some man-in-the-middle shell scripts to get VUnit's calls and forward them to containers. However, it is not clean, because it is not straighforward to handle the lifecycle of tests as VUnit does.

@kraigher (Gitter): @1138-4EB Regarding your use case about customizing the command for running vsim etc. it has some overlap with the use case of running a batch server farm with bsub if you are familiar with that. Anyway such things are easier to discuss in an issue rather than in this chat.

eine commented 6 years ago

@kraigher , I've checked bsub and, although similar, it is overkill. Because it is meant for HPC with multiple physical devices, an additional orchestration service (LSF) must be run. My proposal is more straighforward because all the containers are expected to be executed in the same machine, so no additional service is required.

Well, docker is the service, somehow, as docker exec works for both local containers and services in a swarm cluster. But I'll focus on the local case, which is easier to follow. Four functionally equivalent examples are shown below:

All the commands at the same time with local tools.
All the commands at the same time in a temporal container (same as #323, but without VUnit).
All the commands at the same time in a named container running in the background.
Commands one-by-one in a named container running in the background.

I want to achieve 2, 3 or 4 with VUnit. I will focus on 4, as I think it is the less invasive approach from VUnit's perspective:

Create the container and keep it running in the background, before calling VUnit or as a pre-task hook.
Intercept any ghdl <GHDL_ARGS> call from VUnit and decorate as docker exec -t -w <PATH> <CONTAINER_NAME> bash -c 'ghdl <GHDL_ARGS>'.
Stop and remove the container when VUnit exits or as a post-task hook.

That's quite easy to achieve, as long as all the sources and products are kept in the single folder that is shared between VUnit and the container with GHDL. However, VUnit requires additional external sources to be built. E.g., for example/vhdl/array_axis_vcs more than 100 files from usr/local/lib/python3.6/site-packages/vunit_hdl-3.0.3rc0-py3.6.egg/vunit/vhdl/ are compiled. There are two solutions:

Share two folders between VUnit and the GHDL container: the project folder and the installation path of VUnit sources.
Share only the project folder and make VUnit copy the required sources there.

Unless you tell me not to do so, I'd go with the first option, because copying the same sources to every project makes little sense to me.

As a result, the command 'decoration' is required to not only prepend and append some args, but also to rewrite the two paths that might appear in <GHDL_ARGS>.

Interception/decoration script

Shall all be done without any modification to VUnit, this can be a local prototype of the interception script:

$ mv /usr/local/bin/ghdl /usr/local/bin/ghdlbin
$ vim /usr/local/bin/ghdl

#!/bin/sh
/usr/local/bin/ghdlback $@

:wq

In order to see the exact commands that are executed:

#!/bin/sh

if [ "$1" != "--version" ]; then
  echo "$@"
fi

/usr/local/bin/ghdlback $@

and we get:

-a --workdir=/work/vunit_out/ghdl/libraries/vunit_lib --work=vunit_lib --std=08
-P/work/vunit_out/ghdl/libraries/vunit_lib -P/work/vunit_out/ghdl/libraries/osvv
m -P/work/vunit_out/ghdl/libraries/lib /usr/local/lib/python3.6/site-packages/vu
nit_hdl-3.0.3rc0-py3.6.egg/vunit/vhdl/check/src/check.vhd
Compiling into vunit_lib: ../usr/local/lib/python3.6/site-packages/vunit_hdl-3.0
.3rc0-py3.6.egg/vunit/vhdl/array/src/array_pkg.vhd
  passed
-a --workdir=/work/vunit_out/ghdl/libraries/vunit_lib --work=vunit_lib --std=08
-P/work/vunit_out/ghdl/libraries/vunit_lib -P/work/vunit_out/ghdl/libraries/osvv
m -P/work/vunit_out/ghdl/libraries/lib /usr/local/lib/python3.6/site-packages/vu
nit_hdl-3.0.3rc0-py3.6.egg/vunit/vhdl/array/src/array_pkg.vhd
Compiling into lib:       src/test/tb_axis_loop.vhd
  passed

--elab-run --std=08 --work=lib --workdir=/work/vunit_out/ghdl/libraries/lib -P/w
ork/vunit_out/ghdl/libraries/vunit_lib -P/work/vunit_out/ghdl/libraries/osvvm -P
/work/vunit_out/ghdl/libraries/lib tb_axis_loop tb -gtb_path=/work/src/test/ -gr
unner_cfg=active python runner : true,enabled_test_cases : test,output path : /w
ork/vunit_out/test_output/lib.tb_axis_loop.test_260bb5c8c675e898eca5dc9024a4420e
de12c0bc/,tb path : /work/src/test/,use_color : true --assert-level=error

that we can rewrite as:

-a --workdir=${WORKDIR}/vunit_out/ghdl/libraries/vunit_lib --work=vunit_lib --std=08
-P${WORKDIR}/vunit_out/vunit_lib -P${WORKDIR}/vunit_out/ghdl/libraries/osvv
m -P${WORKDIR}/vunit_out/lib ${VUDIR}/vhdl/check/src/check.vhd
Compiling into vunit_lib: ..${VUDIR}/vhdl/array/src/array_pkg.vhd
  passed
-a --workdir=${WORKDIR}/vunit_out/ghdl/libraries/vunit_lib --work=vunit_lib --std=08
-P${WORKDIR}/vunit_out/ghdl/libraries/vunit_lib -P${WORKDIR}/vunit_out/ghdl/libraries/osvv
m -P${WORKDIR}/vunit_out/ghdl/libraries/lib ${VUDIR}/vhdl/array/src/array_pkg.vhd
Compiling into lib:       src/test/tb_axis_loop.vhd
  passed

--elab-run --std=08 --work=lib --workdir=${WORKDIR}/vunit_out/ghdl/libraries/lib -P${WORKDIR}/vunit_out/ghdl/libraries/vunit_lib -P${WORKDIR}/vunit_out/ghdl/libraries/osvvm -P
${WORKDIR}/vunit_out/lib tb_axis_loop tb -gtb_path=${WORKDIR}/src/test/ -gr
unner_cfg=active python runner : true,enabled_test_cases : test,output path : ${WORKDIR}/vunit_out/test_output/lib.tb_axis_loop.test_260bb5c8c675e898eca5dc9024a4420e
de12c0bc/,tb path : ${WORKDIR}/src/test/,use_color : true --assert-level=error

where

WORKDIR="/work"
VUDIR="/usr/local/lib/python3.6/site-packages/vunit_hdl-3.0.3rc0-py3.6.egg/vunit"

So we can either hardcode or use regexps to replace those with MYWORKDIR and MYVUDIR, that correspond to the paths inside the container. But it would be cleaner if we could instruct VUnit to use MYWORKDIR and MYVUDIR only for external calls. Note that it should keep using WORKDIR and VUDIR internally.

If that was supported, examples 2, 3 and 4 could be easily achieved replacing /usr/local/bin/ghdlback $@ with docker exec -t -w /src ghdl-sim bash -c "ghdl $@".

I might also have overlooked the docs. If this is already supported, please let me know.

Examples: from local to `docker run` to `docker exec`

git clone https://github.com/1138-4EB/hwd-ide
cd hwd-ide/examples/full_adder

#---
#--- All the commands at the same time with local tools
#---

mkdir -p output
cd output
ghdl -a ../hdl/*.vhd
ghdl -r adder_tb --vcd=adder.vcd
cd ..

ls -la output
rm -rf output

#---
#--- All the commands at the same time in a temporal container
#---

docker run --rm -t \
  -v /$(pwd)://src \
  -w //src \
  ghdl/ghdl:stretch-mcode \
  bash -c '\
    mkdir -p output && \
    cd output && \
    ghdl -a ../hdl/*.vhd && \
    ghdl -r adder_tb --vcd=adder.vcd\
  '

ls -la output
rm -rf output

#---
#--- All the commands at the same time in a named container running in the background
#---

# First, create the container and keep it running in the background
docker run --name ghdl-sim -d \
  -v /$(pwd)://src \
  ghdl/ghdl:stretch-mcode \
  tail -f /dev/null

# Then, exec commands inside
docker exec -t \
  -w //src \
  ghdl-sim \
  bash -c '\
    mkdir -p output && \
    cd output && \
    ghdl -a ../hdl/*.vhd && \
    ghdl -r adder_tb --vcd=adder.vcd\
  '

# Last, stop and remove the container
docker rm -f ghdl-sim

ls -la output
rm -rf output

#---
#--- Commands one-by-one in a named container running in the background
#---

# First, create the container and keep it running in the background
docker run --name ghdl-sim -d \
  -v /$(pwd)://src \
  ghdl/ghdl:stretch-mcode \
  tail -f /dev/null

# Then, run each command in its own exec
docker exec -t -w //src ghdl-sim bash -c '\
  mkdir -p output \
'

docker exec -t -w //src/output ghdl-sim bash -c '\
  ghdl -a ../hdl/*.vhd \
'

docker exec -t -w //src/output ghdl-sim bash -c '\
  ghdl -r adder_tb --vcd=adder.vcd \
'

# Last, stop and remove the container
docker rm -f ghdl-sim

ls -la output
rm -rf output

eine commented 6 years ago

I forgot to mention that Docker provides a Python SDK: https://docs.docker.com/develop/sdk/ It lets you do anything the docker command does, but from within Python apps. So this feature can be embedded into VUnit if wanted, instead of relying on external scripts.

kraigher commented 6 years ago

Spontaneously it seems like a lot of complexity just to avoid installing VUnit in the container.

I infer that your problem is that you want to bumb the VUnit version often and it creates the need for many containers. Wouldn't the easiest solution just be to just run VUnit from within the container but have the VUnit repo outside of the container. As long as VUnits only dependency (colorama) is installed in the container you can just set the PYTHONPATH to the VUnit repo and map the repo root to the container to run it without any "install". When I run VUnit i never install it with setup.py or pip I just set the PYTHONPATH in the shell.

eine commented 6 years ago

The main point is not to avoid installing VUnit (which is only 6MB, including the docs), but to avoid installing Python:

debian:stretch-slim, which is the base I use, is 55.3 MB.
python:slim-stretch: 141.5 MB.

So, installing python in each container requires four times as much space for the base image. In practice, this is the difference:

ghdl/ghdl:stretch-mcode 189.3 MB.
ghdl/ext:vunit-master 304.6 MB.

Moreover, if external, VUnit itself could be executed in a python:3.6-alpine3.7 container, that requires 'only' 83.4 MB (a half of the Debian version). Ideally, GHDL would also be distributed based on some alpine image, but unfortunately, GHDL cannot be compiled for alpine.

A different point, which I don't know how relevant might be, is that ghdl/ext images are based on a single python version. So different images are required if other python versions are needed. For example, right now it is hard to integrate docker in the travis matrix of vunit.

kraigher commented 6 years ago

Some random thoughts:

When running VUnit the container would have to live for the entire run of the run.py file. In the case of GHDL the only binary that VUnit calls is ghdl thus it would be quite small work to just make a man in the middle script and add a wrapper script for running VUnit with container setup and teardown arround the run.py call.
The user would expected that all paths on the local file system would be visible to the docker container. A source file or input data file could be located anywhere.
When running locally outside of CI environment the user would expect to launch simulator GUI or gtkwave.
Commercial simulator installs are often > 1 GByte anyway so the space savings would not as significant.
Does users of VUnit really care about multiple Python versions? It might only be a use case for the VUnit maintainers to run tests with different Python versions.

eine commented 6 years ago

When running VUnit the container would have to live for the entire run of the run.py file.

This can be easily met.

In the case of GHDL the only binary that VUnit calls is ghdl thus it would be quite small work to just make a man in the middle script and add a wrapper script for running VUnit with container setup and teardown arround the run.py call.

What about path translation/modification? Can VUnit be instructed to introduce two keywords (prefixes) to make regexp easier?

The user would expected that all paths on the local file system would be visible to the docker container. A source file or input data file could be located anywhere.

This can be done by sharing the root of the hard drive with the container. But that's not suggested at all, and it does not make much sense considering the user is using a container or virtual machine. I mean, if the user cannot keep all the files inside a project folder (or copy the external ones before calling VUnit), then I think that using a container or virtual machine is not an option.

When running locally outside of CI environment the user would expect to launch simulator GUI or gtkwave.

There are two requirements:

Have an X server running on the host. This is not a problem in GNU/Linux. For Windows, Xming, Cygwin/X or VcXsrv can be used.
Install X dependencies, icons, etc. in the container.

Then, this wrapper script automatically sets the required environment variables (tested in GNU/Linux and Windows). Indeed, it uses xdpyinfo to test if there is any X server available; if it is not, it checks if it is a MINGW environment (Windows) and prompts the user for confirmation to initialize Xming. See 1138-4EB/hwd-ide/wiki/Desktop#windows.

I use this script very frequently (everyday) with QuestaSim docker images and with ghdl/ext:vunit-gtkwave (GHDL + VUnit + GtkWave). Demo.

BTW, this point does nothing to do with this issue. I mean, this is an option right now, and it would also be if any change was made.

Commercial simulator installs are often > 1 GByte anyway so the space savings would not as significant.

I hope to be able to build GHDL for ARM at some point, so thinking about low-cost target devices size matters. However, I agree with you: this is a very specific use case in the VUnit user base.

Does users of VUnit really care about multiple Python versions? It might only be a use case for the VUnit maintainers to run tests with different Python versions.

I cannot help here. I do not. But I am not a Python user, so just touch the minimum. I do not really know what the differences between versions are.

kraigher commented 6 years ago

What about path translation/modification? Can VUnit be instructed to introduce two keywords (prefixes) to make regexp easier?

Path translation is complex. There are a lot of paths everywhere. paths in .tcl files, path given on argv to the simulation. paths given as generics to the test bench from the user. However the recommendation is for the user to create paths in the run.py relative to the project folder or run.py and that the project folder is pointed to relative to the run.py file. Thus running in docker would not need any complex path translation since all paths where created relative to the run.py file.

The way I see it you can already achieve what you want by doing to things.

Create ghdl in fakeghdl containing:
```
docker exec ghdl $ARGS
```

Create run.sh containing:

export VUNIT_GHDL_PATH=fakeghdl/
setup_docker.sh
python run.py $ARGS
teardown_docker.sh

eine commented 6 years ago

That's what I did above. See section 'Interception/decoration script'.

After your explanation I suppose that WORKDIR is always going to be the path where run.py is called from, so I can just set an environment variable from run.py. About VUDIR="/usr/local/lib/python3.6/site-packages/vunit_hdl-3.0.3rc0-py3.6.egg/vunit", how do you get it programatically?

kraigher commented 6 years ago

About VUDIR="/usr/local/lib/python3.6/site-packages/vunit_hdl-3.0.3rc0-py3.6.egg/vunit", how do you get it programatically?

python -c "import vunit;import os;print(os.path.abspath(os.path.dirname(vunit.__file__)))"

eine commented 6 years ago

Thanks! I'll give it a try tonight!

kraigher commented 6 years ago

But if you do not "install" VUnit using pip or setup.py and just keep it in a folder next to the project you can always map that into docker and set the PYTHONPATH

eine commented 6 years ago

Well, everything we commented works as expected:

ghdl
run.sh

Running run.sh examples/vhdl/array/run.py -v is equivalent to python examples/vhdl/array/run.py -v and all the sources are properly compiled (lines 519-584 in this log).

However, it seems that python is a runtime dependency for GHDL itself if VUnit is used to run the design (lines 292 and 297 of the log). I suppose it is related to src/vhdl/python. Does VUnit require simulators to provide a python interface or is it used only when available, for some "advanced" features?

kraigher commented 6 years ago

VUnit does not use any Python feature of GHDL. We only call the binary. I have seen GHDL recently add Python API but VUnit has no plans to use that.

eine commented 6 years ago

Then, I'll have to investigate. I think that the script might be removing some symbols. This is the command producing the error:

ghdl \
--elab-run \
--std=08 \
--work=lib \
--workdir=/work/examples/vhdl/array/vunit_out/ghdl/libraries/lib \
-P/work/examples/vhdl/array/vunit_out/ghdl/libraries/vunit_lib \
-P/work/examples/vhdl/array/vunit_out/ghdl/libraries/osvvm \
-P/work/examples/vhdl/array/vunit_out/ghdl/libraries/lib \
tb_sobel_x tb \
-gtb_path=/work/examples/vhdl/array/src/test/ \
-grunner_cfg=active python runner : true,enabled_test_cases : test_input_file_against_output_file,output path : /work/examples/vhdl/array/vunit_out/test_output/lib.tb_sobel_x.test_input_file_against_output_file_b1de5eca46714be73ca0390ba61a470c08d9bdc0/,tb path : /work/examples/vhdl/array/src/test/,use_color : true --assert-level=error

There should be quotes or double quotes in the last line, isn't it?

kraigher commented 6 years ago

Yes it seems like a quoting issue. VUnit does not rely on any shell and sets argv of the subprocess directly.

eine commented 6 years ago

Indeed, I had to put the quotes in the ghdl interface of VUnit, because as soon as those are received by a bash script the generics are split. See #327. I hope that this does not imply any problem in other use cases.

eine commented 6 years ago

Well, there is no way for a shell to receive unescaped arguments and forward those unchanged. Receiving them inherently splits them. https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

In python there are built-in solutions to satinize the output argv, when the target is known to be a shell:

However, this implies that a minimal change needs to be done to VUnit, in order to let it know when should the output be escaped.

kraigher commented 6 years ago

There is no "splitting" of arguments going on between VUnit and your wrapper script. It is only when issuing a command in a shell where a tokenization of the input string into a list of strings (argv) occurs. When invoking a subprocess from a program in for example C or Python it is normal to invoke the executable directly without having a shell in the middle tokenizing a string by setting the argv directly. This is the most exact and precise method to provide the argv list of strings to the subprocess when calling it. If your wrapper script was itself a Python or C-program it would be trivial for that to in turn start GHDL as a subprocess using the exact same and unmodified argv. Anyway doing it from a shell script seems less obvious (More reason to avoid shell scripts in favor of real programming languages) but after googling I found the "$@" bash variable which seems to solve this problem.

ghdl "$@"

eine commented 6 years ago

Your are ok. If we had to just forward clean it might work. But we cannot test that, because paths need to be rewritten. Indeed, the objective of putting an intermediate wrapper is to modify something. As soon as echo $@ | sed or for a in $@ are used, spaces are interpreted as breaks.

Then, as you say, a different language must be used. Writting an additional program in C or Golang is of little help, because it should be compiled for each platform. Then, since VUnit is written in Python, the most sensible approach is to do it in VUnit:

if os.environ.get('VUNIT_SIMULATOR_DOCKER_IMG') is not None:
   for i in range(len(cmd)):
      cmd[i] = shlex.quote(cmd[i])

At this point, it makes sense to do string replacements in the same place:

if os.environ.get('VUNIT_SIMULATOR_DOCKER_IMG') is not None:
   vunit_dir=os.environ.get('VUNIT_DIR')
   for i in range(len(cmd)):
      cmd[i] = shlex.quote(cmd[i].replace(vunit_dir,'/vunit'))

Moreover, because VUNIT_DIR and PROJECT_DIR are already available in VUnit, I'm guessing how to reuse them, instead of reading the envvars. Indeed, if the replacement is done inside VUnit (which was my very first proposal in this thread), there is no need for escaping: "$@" will work.

kraigher commented 6 years ago

Firstly there is no such thing as a PROJECT_DIR in VUnit that is purely a user side invention if it exists.

Secondly I do not like replacing strings with unknown meaning just because they look like some path that in 99% of the cases should be replaced but not in the 1%. Thus this is not something I am keen on integrating into VUnit. If it should be integrated into VUnit it needs to be a 100% semantically correct operation. I am not against adding something to VUnit to support your use case but it needs to have a good separation of concerns and not be a hack.

Anyway the best language to write your wrapper script would be Python. You need Python anyway to run VUnit in the host.

# Sketch of how it could work
import sys
import subprocess
subprocess.call(["ghdl"] + [fix_and_replace_stuff(arg) for arg in sys.argv])

Maybe you could also use the docker API via python to launch GHDL. In that way there is no shell involved at all and everything is precise except path replacement which is not 100% semantically ok.

Regarding path substitution I think you can and should avoid it completely. To avoid it you just need to ensure that the vunit folder and project folder are identical in the docker image and the host. For the project folder that should be doable. For the vunit folder you can just place vunit next to the project folder and set the PYTHONPATH there and it should also be able to look the same to the host and image if the project could. For example if you have a /workspace/ in the host with /workspace/project/ and /workspace/vunit/ you can ensure that those paths are identical on the host and image.

eine commented 6 years ago

Firstly there is no such thing as a PROJECT_DIR in VUnit that is purely a user side invention if it exists.

Indeed, in the examples above it is the path where run.py is located. Not something embedded in VUnit, but a convention used in all the exercises.

Secondly I do not like replacing strings with unknown meaning just because they look like some path that in 99% of the cases should be replaced but not in the 1%. Thus this is not something I am keen on integrating into VUnit. If it should be integrated into VUnit it needs to be a 100% semantically correct operation. I am not against adding something to VUnit to support your use case but it needs to have a good separation of concerns and not be a hack.

On the one hand. I am not suggesting to include any of the latest changes. I considered this issue to be closable with "Spontaneously it seems like a lot of complexity just to avoid installing VUnit in the container". The conversation afterwards was just to let myself understand how large that complexity is. Moreover, it let both of us realize that a couple of shell scripts do not suffice.

On the other hand, I am not confident enough to push any of the python changes, now that I get to understand the implications. For now, I just want to keep a branch in my fork. Considering a possible future integration, these are some options I need to investigate:

Add some flag to change _get_sim_command and compile_vhdl_file_command in ghdl_interface:
Add a "new" simulator, ghdl_docker_interface that inherits from ghdl_interface. This would allow to implement the same, but make sure that it does not interfere at all with the "normal" usage. However, I don't know how complex it is to add a new simulator.
Provide a generalized mechanism to replace paths, for any simulator. This would be a table with two columns ("host" and "container"), that would allow the user to freely map resources in the host to resources in the container. For example, for a library the user might want to alternate between local sources and sources shipped with the container. In order to make it clean, these paths should be provided as optional arguments when adding the libraries in the run.py. So, it is not an easy task.

Apart from where/how to put the code, these are the functionalities I want to try in Python:

Wrap with docker exec -t -w /src vunit-sim bash -c. If the replacement is done in Python, there is no point in keeping the fakeghdl script.
Force path replacements .
- Here, I do not agree with your consideration about 99% vs 1%. At least, I don't understand it. The fact that files are relocated is not related with the meaning of those files. If an absolute path in the host is bind to an absolute path in the container, the replacement should be applied to any absolute path being passed. I understand that there might be problems if replacements where performed altering parts in the middle of the path, but that is not the case.
  - Anyway, filtering all the commands is just a brute force approach. Replacements should be applied to some elements only: _project.get_library(config.library_name).directory, _project.get_libraries(), and some fields in config.generics.items().
When I get the first prototype to work somehow cleanly, the next step is to run python in a container and the simulator in a different container, which is my real use case: the host should only contain sources, not a single tool or dependency (except docker). This is of higher priority for me than writting everything in Python.
Include the container creation and teardown in the main function of ui. However, I saw that there is not a single main, but four. Therefore, it might be difficult to make it "clean".

Maybe you could also use the docker API via python to launch GHDL. In that way there is no shell involved at all and everything is precise except path replacement which is not 100% semantically ok.

From my previous experience with the Golang SDK, it is just a matter of syntax. There is very little diference between using Process/subprocess or using the SDK directly for these really simple execution tasks. The real difference is felt when you need to get precise data from the containers/images, as it avoids lots of log parsing. That is, the SDK would be useful if we were to run multiple containers in parallel in order to execute different tests in completely isolated environments.

Also, the main point not to include the SDK is not to make VUnit dependent. I suppose that it is possible to tell VUnit to lazy-load some libraries, but I don't know how to do it (yet). As a general question, what do you think about adding the docker SDK as a dependency? I mean something as general as a --docker <IMAGE_NAME> optional arg to run.py. The implementation can be as simple as the solution that is explained in the docs, or as complex as what we are talking here. The point is that, shall you be open to a future clean integration, I would use the SDK in my branch. But, if you don't want any docker related code in the VUnit codebase (which I find completely legit), I'll reconsider my approach and make something external (e.g. an example "extended" run.py).

Regarding path substitution I think you can and should avoid it completely. To avoid it you just need to ensure that the vunit folder and project folder are identical in the docker image and the host. For the project folder that should be doable. For the vunit folder you can just place vunit next to the project folder and set the PYTHONPATH there and it should also be able to look the same to the host and image if the project could. For example if you have a /workspace/ in the host with /workspace/project/ and /workspace/vunit/ you can ensure that those paths are identical on the host and image.

This works. For example: https://travis-ci.org/1138-4EB/vunit/builds/368708071 There, both PROJECT_DIR and VUNIT_DIR are $(pwd). So, although there are a bunch of lines of code, essentially a single folder is being shared.

However, I am on a Windows 10 host, and I am using MINGW to interact with docker. Docker for Windows 10 is a very small Virtual Machine running on Hyper-V. There is so much going on with the paths that it is hard to ensure that the same can be used. Besides, I need to support at least two paths (VUnit can be located in any of them):

A common repository that is external to the folder containing run.py and the testbench.
The folder containing run.py and sources specific to the project.

Therefore, I need the approach to be flexible enough. If it is not possible, I can always wrap run.py with a script to copy all the external sources to a temporal dir next to it. Indeed, this is what I do now. But I'd rather explore alternatives.

kraigher commented 6 years ago

The way I see it the benefit has to outweigh the cost and complexity so lets analyse them.

Benefits:

Reduce image size: Avoiding to install Python in the simulator docker image which saved 25% or 100 MByte in your case.
1. However with commercial simulators the relative saving is less since these installs are often > 1 GByte.

Costs:

Increased complexity: VUnit path handling needs to be abstracted and VUnit internal codebase needs to consider a mapping between host/container in many places. It would require a lot of unit tests. It is not enough to do this just in the simulator layer of VUnit even though adding a new simulator is quite easy. Paths are added to generics by VUnit which is sent to the simulator layer, those paths have to be substituted before they become string generics.
Leaky abstraction: The user has to do path substitution in the run.py file for the things which VUnit cannot know is a path.
1. The user can set a path via a generic and VUnit has no way of knowing that it was a path and not some other string which would inhibit safe replacement.
2. The user can set compile_option or sim_option which depending on simulator could include a path. VUnit has no way of knowing that it is a path and cannot replace it with 100% certainty.

In this cost/benefit analysis I think the costs outweigh the benefit for VUnit. The way I see it 100 Mb larger docker image (25% saving in the best case scenario) is not worth increased complexity in the VUnit code base and presenting a leaky abstraction to the user. To me just installing Python in the image and get everything working without issues looks more attractive.

At work we have docker images for simulation using a commercial simulator and the size of Python in those are not significant. The docker image for the simulator is built on top of the docker image with Python which is already needed for the slave machines to have a controlled environment. In that way the Python layer is already cached on the slaves and the simulator layer is the only thing that needs to be downloaded again (Often that is cached as well).

eine commented 6 years ago

I updated dockerexec with a Python based solution that requires a minimum modification to the VUnit codebase. See README and travis-ci.org/1138-4EB/vunit/jobs/369877366.

The required modification is in vunit/simulator_interface.py: now, simulator executables are only searched in the system environment variable PATH, not in the PYTHONPATH (sys.path). In a python script, it is easier to programatically add a path to sys.path than adding it to the system PATH. Programatically adding a path is required in order to detect fake executables and let VUnit init when no other valid simulator is available in the system.

Apart from that, an issue raised when trying to use the snippet to get the installation path of VUnit: it seems that the existence of valid simulators is checked when VUnit is imported, neither when a class of the instance is generated nor when main is executed. Therefore, a circular dependency is generated: I cannot add the path to the fake executables located relative to the VUnit installation, because there is no valid simulator in the path. If I could read the installation path without checking for valid simulators, then I would add the path and VUnit would find a valid one. @kraigher, shall I open a separate issue to suggest making the simulator check later/optional?

The way I see it the benefit has to outweigh the cost and complexity so lets analyse them.

Benefits:

Reduce image size: negligible with most simulators but GHDL (25% uncompressed, 35-40% compressed).
Allow/support any docker image with a valid simulator, independently of the dependencies available there. E.g. the prototype works with any of these, without requiring to generate other 14 images.
Allow a local VUnit to run simulations in remote containers. However, this requires additional resources to securely expose docker sockets in remote hosts. Useful in isolated lab networks.
Achieve a modular scheme, where each container fits a function.

Costs:

Increased complexity: a wrapper class is provided to the user, which is loaded instead of vunit. The class handles the creation/removal of containers, and the setup for VUnit to use fake executables. Yet, the user is required to be aware of all the paths in the host that are used during the simulation.
- Safe approach: if the user ensures that all the sources/data are inside the directoy where run.py is located, and all the paths are relative, no additional set up is required.
- Possible incompatibility: if absolute paths are used, some tests might only be configured to be run either with this modular approach or locally with the default solution. This is because all the sources must use the path formats from inside the containers, and it is not always possible to use the same absolute paths in the host and inside containers (see this comment). However, for most of the use cases, setting up the volumes is straighforward.
Less transparent debugging: some errors might arise due to the requirement to use the path format from inside the containers. Also, if a git repo is cloned in windows, the line-endings of some files might change and produce errors when executed inside GNU/Linux containers. Now, although successful logs/outputs are the same as with the default execution of VUnit, when errors arise not all the log is shown. I'd consider this a bug of my implementation, but it is worth mentioning.

VUnit path handling needs to be abstracted and VUnit internal codebase needs to consider a mapping between host/container in many places. It would require a lot of unit tests. It is not enough to do this just in the simulator layer of VUnit even though adding a new simulator is quite easy. Paths are added to generics by VUnit which is sent to the simulator layer, those paths have to be substituted before they become string generics.

The current approach is to start two containers, one with VUnit and another one with the simulator. Both containers have the same volumes/directories bind to the same paths in the host. Therefore, no path handling/mapping is required inside VUnit, because all the paths available to VUnit are also available to the simulator.

Leaky abstraction: The user has to do path substitution in the run.py file for the things which VUnit cannot know is a path.

The user can set a path via a generic and VUnit has no way of knowing that it was a path and not some other string which would inhibit safe replacement.

The user can set compile_option or sim_option which depending on simulator could include a path. VUnit has no way of knowing that it is a path and cannot replace it with 100% certainty.

As explained above, now the user is explicitly aware of which paths are used inside the containers, and sources must be written for these. Therefore, not all existing tests can be directly converted.

In this cost/benefit analysis I think the costs outweigh the benefit for VUnit. The way I see it 100 Mb larger docker image (25% saving in the best case scenario) is not worth increased complexity in the VUnit code base and presenting a leaky abstraction to the user.

I think that the current approach levels the balance up. There is no increased complexity in the VUnit code. Indeed, there is no need to merge anything but the modification at the beginning of this comment. Then, shall anyone be interested on splitting VUnit from the simulator, this issue and the referred branch suffice.

To me just installing Python in the image and get everything working without issues looks more attractive.

AFAIK, there is no straighforward solution to add a docker layer on top of two or more different images, even if those images share a common parent. I.e., there is no equivalent to git merge/rebase in docker. Therefore, installing the same Python and VUnit version with four different simulator containers requires repeating the same process four times. Keeping those images in sync between different machines when there is no private registry available is painful. Allowing a single VUnit docker image to talk to any container produces less update issues.

The docker image for the simulator is built on top of the docker image with Python which is already needed for the slave machines to have a controlled environment. In that way the Python layer is already cached on the slaves and the simulator layer is the only thing that needs to be downloaded again (Often that is cached as well).

In my case, the simulators are available in images which do not have Python installed. I.e., debian:stretch-slim. Therefore, in order to generate versions with python, I start from the corresponding python image (python:slim-stretch) and --copy-from the simulator image. As a result, it is easy to forget copying something, plus installed runtime dependecy versions might not always match. Furthermore, if the simulator is updated, two layers need to be sent, even if both contain the same data.

I asked about this recently and I was referred to moby/buildkit but it seems to be in early development (yet).

kraigher commented 6 years ago

better to reopen this issue which has all context

kraigher commented 6 years ago

The required modification is in vunit/simulator_interface.py: now, simulator executables are only searched in the system environment variable PATH, not in the PYTHONPATH (sys.path). In a python script, it is easier to programatically add a path to sys.path than adding it to the system PATH. Programatically adding a path is required in order to detect fake executables and let VUnit init when no other valid simulator is available in the system.

Seems very wierd to try to find simulators in the PYTHONPATH, I would never expect to find simulators there.

It is quite easy to set the PATH environment variable within Python.

import os
os.environ['PATH'] = os.environ['PATH'] + os.pathsep + "stuff"

Also to be explicit about where the simulator is located you can use VUNIT_<SIMULATOR_NAME>_PATH such as VUNIT_GHDL_PATH.

Anyway regarding the simulator detection on import I can move that to the instantiation of the VUnit class.

eine commented 6 years ago

Seems very wierd to try to find simulators in the PYTHONPATH, I would never expect to find simulators there.

I read that the PYTHONPATH should be modified with sys.path, because using os.environ has no effect once the script is loaded. But, you are ok, this has nothing to do with the PATH for executables. I just mixed both things.

So the branch is now edited to use os.environ['PATH'], as you suggest.

Also to be explicit about where the simulator is located you can use VUNIT__PATH such as VUNIT_GHDL_PATH.

That was the very first approach, but I'd like VUnit_docker to 'automatically' detect the available fake executable(s) for the user, because I have not added envvar passing to the containers, yet.

Anyway regarding the simulator detection on import I can move that to the instantiation of the VUnit class.

That'd be great, as it allows to replace VUnit_docker.py#L20-L37 with:

try:
    import vunit
except:
    sys.path.insert(0, '/work/vunit')
    import vunit

environ['PATH'] += pathsep + abspath(realpath(dirname(vunit.__file__)+"/../bin"))
ui = vunit.VUnit.from_argv()
return ui

eine commented 6 years ago

After the recent commits to the master branch, the snippet above is now implemented in dockerexec. Therefore, the prototype is functional and quite clean for my requirements. I might update the branch in the future, to add extra features that might be needed in more complex use cases (e.g., swarms). But, for now, I need to move on.

All my questions about VUnit are now answered, and the suggested and accepted changes are already merged. So, I think we can close this issue. Anyway, I'll leave it open, just in case you want to keep it as a "live" reference of what the title describes.

@kraigher, thanks so much for your patience, time and effort. Your help, and having this issue solved, let me learn quite a lot about Python and the internals of VUnit.

VUnit / vunit

Let VUnit execute simulators which are inside a container or VM, from outside #324

Interception/decoration script

Examples: from local to docker run to docker exec

Examples: from local to `docker run` to `docker exec`