mmh352 / ou-container-builder


Add name and other metadata to configuration file #7

Closed psychemedia closed 3 years ago

psychemedia commented 3 years ago

At the moment, the module metadata in the config file supports a module code and presentation but no other way to identify the module. It would be useful to be able to give the image a name, not least to distinguish images where a module makes use of several different container types.

module:
  code: DEMO
  presentation: 21J
type: jupyter-notebook
content:
  - source: content
    target:
    overwrite: if-unchanged

This also raises the question of what other metadata might be useful.

mmh352 commented 3 years ago

If a module requires multiple containers, then a suffix should simply be added to the code field (say TT284-B1, TT284-B2). The builder doesn't actually care what is in that field. If more complex tags are needed, then those should be applied post-hoc.

With regards to the description, where would that be shown / used?

psychemedia commented 3 years ago

Re: where would description be shown - I'm thinking of it as metadata so it just needs a conventional location in the container. Something else I have found useful is putting a version number and/or build date into a conventional location.

Some students who postpone, and some ALs, often run legacy containers/images, and it can be handy having a version.txt file in a conventional location deep inside the container that we can just ask them to check. (In MyBinder/repo2docker builds, I used to use ${CONDA_DIR}/version.txt because that persisted. I've also previously stashed things in /var.) When debugging in student (or AL) forums, it can be really useful sometimes ("Oh, we haven't used that for the last three years..." ;-))
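As a minimal sketch of the version-file idea, the helper below writes a version string and a UTC build date to a version.txt file. The function name, the key=value layout, and the default /var/ou-container directory are all assumptions made for illustration; none of them is something the builder is known to provide.

```shell
#!/bin/sh
# Hypothetical helper: record a version and build date in a conventional location.
# Usage: write_version_file VERSION [DIR]
# DIR defaults to /var/ou-container, which is an assumed convention, not a builder feature.
write_version_file() {
  version="$1"
  dir="${2:-/var/ou-container}"
  # Create the target directory if it does not exist yet.
  mkdir -p "$dir" || return 1
  {
    printf 'version=%s\n' "$version"
    printf 'build_date=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  } > "$dir/version.txt"
}
```

It could be called from a build script, for example as write_version_file v4, so that a student can later be asked to open a single file.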

psychemedia commented 3 years ago

The description can also be saved into the Dockerfile and stashed in the image if we follow that convention.

mmh352 commented 3 years ago

I'm still unconvinced about the description. Who would read this? In general the content delivered to the student in the home directory of the container will contain a README or similar to explain where they are. Why would there be a need to have a second, very much abbreviated version of the same somewhere in the container where nobody will see it?

I agree with the need to version containers. However, to that end the system uses the Docker tags. Each code-presentation combination generates its own repository and then the ":xxxx" label can be used for versioning. So you would have something like TT284-21J:latest or TT284-21J:v4. This information is always available via docker. Additionally the image has a unique hash, which is also available to the student running the container. I don't see the need to duplicate this.

psychemedia commented 3 years ago

When you start having to support students in the forums, tell me you don't need every bit of help you can when trying to get them to do things / tell you what they see ;-)

Plus, bear in mind that "just" instructions will need unpacking over several paragraphs (i.e. "what container are you running" unpacks as "just tell me what container you are running" unpacks to "You need to tell me what container you are running. To find out what container you are running, follow the next seven steps" etc ;-)

Re: description, these things will live in a workflow. If you look at OU-XML, you'll see it has something for everybody in terms of stuff in the header ;-)

If you have multiple images for a module, you may want to be able to quickly see what each one was for. So eg:

RUN echo "My container description" > /var/ou-container/desc.txt

gives you:

docker run -it outest2 cat /var/ou-container/desc.txt

Also docker run -it outest2 cat /var/ou-container/Dockerfile etc.

And baking the description into a Dockerfile also gives you more documentation in the Dockerfile.

(It's not necessarily you this is for, it's for person or persons unknown who have to work with this stuff in the future.)

PS FWIW, I will probably bake my own metadata, version file etc into the container via the script command because I know from experience that one day in the future it will make handling a particular support issue easier ;-)

psychemedia commented 3 years ago

One other thing I noticed: in the build, I keep seeing <none> as the image name (maybe if I don't tag it? I think if I do tag it, it gets the name mmh352/$CODE-$PRESENTATION), which means I need to run up a container by referencing the image ID, not the name.

Two observations about that:

  1. May be useful to allow setting of something other than the mmh352 component;
  2. May be useful to allow setting of something other than the $CODE-$PRESENTATION component

Generally, give folk (optional, away from default) control over how they create IMAGE_NAME = ORG/CONTAINER:TAG.
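The ORG/CONTAINER:TAG idea with optional overrides could be sketched roughly as follows. The IMAGE_ORG, IMAGE_NAME and IMAGE_TAG variables are hypothetical names invented for this example, and the mmh352 default simply mirrors the behaviour guessed at above rather than anything confirmed about the builder.

```shell
#!/bin/sh
# Hypothetical sketch: compose ORG/CONTAINER:TAG with optional overrides.
# Usage: image_ref CODE PRESENTATION
# IMAGE_ORG, IMAGE_NAME and IMAGE_TAG are assumed override variables, not builder options.
image_ref() {
  code="$1"
  presentation="$2"
  org="${IMAGE_ORG:-mmh352}"
  name="${IMAGE_NAME:-$code-$presentation}"
  tag="${IMAGE_TAG:-latest}"
  # Docker repository names must be lowercase; the tag is left untouched.
  repo=$(printf '%s/%s' "$org" "$name" | tr '[:upper:]' '[:lower:]')
  printf '%s:%s\n' "$repo" "$tag"
}
```

With no overrides, image_ref TT284 21J keeps the default naming; setting any of the three variables moves away from it without changing the config file.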

mmh352 commented 3 years ago

I understand the need to find the exact container version. However, I fail to see how that is easier than asking students to run

docker ps -a
docker images --digests

and then sending you the output. The advantage here is that you don't have to then go finding the container for that version. You can just do docker pull xxx@DIGEST and you have exactly the same container as the student. I'm not against baking in a version number, I would just first like to see that there is a scenario that cannot easily be covered with the existing docker toolset.
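To make the digest route concrete, here is a small sketch that turns repository/digest pairs into references that docker pull accepts (Docker pulls by digest using the repository@digest form). The to_pull_refs name is made up for this example, and the input is assumed to be one "REPOSITORY DIGEST" pair per line.

```shell
#!/bin/sh
# Hypothetical helper: convert "REPOSITORY DIGEST" lines into pullable references.
# Images without a digest show <none> in Docker's output and are skipped.
to_pull_refs() {
  while read -r repo digest; do
    [ -n "$repo" ] || continue
    [ "$digest" = "<none>" ] && continue
    printf '%s@%s\n' "$repo" "$digest"
  done
}

# Example usage (requires Docker, so it is left commented out):
#   docker images --digests --format '{{.Repository}} {{.Digest}}' | to_pull_refs
```

Each line of output can then be passed to docker pull to fetch exactly the image the student has.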

Regarding the description, yes, it will obviously live in a workflow, but storing descriptive workflow metadata in the container is, I believe, the wrong place for it. OU-XML is so full of hacks and implicit behaviour, that I feel the only thing you can learn there is how not to do things.

As you say, you can always use the "file" section to copy arbitrary files to arbitrary locations in the container. Then it becomes possible to evaluate whether they are needed, in which case the tooling can be updated.

psychemedia commented 3 years ago

I'm just speaking from experience and am happy not to press you on this further. (Some students struggle to find the command line, let alone run things on it and tell you what they then see. Also, if you think getting them to copy and paste commands into the command line is simple, I can probably find some recent examples to debunk that.)

I'm happy to bake my own cribs into containers in modules where I'm expected to offer tech support...!

Also, re: things like description, if you are developing images then it may be handy to just drop notes into them as you develop them, which a description field can be handy for.

You're a far more professional developer than I am, I need all the help I can get to help myself keep track of multiple different images in various stages of brokenness that I'm working on;-)

mmh352 commented 3 years ago

I am all too aware of the difficulty with getting people to run stuff in a command-line :-D. My point is really just that the docker solution is one less thing the image developer has to manually maintain, while being no more complex to use than a home-built solution.

Agree with the usefulness of a description for keeping notes as you develop, but these don't belong in the container, they belong in the documentation that goes along with the container. Then both things are easier to maintain.

I'll close this for the moment. Probably best to pick it up as a discussion, if/when we want to revisit that.

psychemedia commented 3 years ago

Sure, tho' I just need to bite one last time: "both things are easier to maintain". Things are easier to maintain when they are self-contained and in one place! As you say, probably better in a more general discussion.

mmh352 commented 3 years ago

Sure, it is easier to maintain if the ContainerConfig file is the only one you have in the project. However, if you also have notebooks, tutorial texts, data-sets, ... in the project, then the description/documentation should cover all of those. At this point it does not belong in the ContainerConfig and maintenance becomes harder, because the developer has to remember to also update the docs in the ContainerConfig.

psychemedia commented 3 years ago

Another observation on "why not" the command to find the tag: if a student is working in a container via a browser UI, and getting errors, then support is easiest if they run commands in that environment to find which version etc it is.

(I'm guessing, but don't have any way to check, that some students who have problems with docker are running variously named containers, and deleting then repulling images and running new containers with abandon and zero knowledge; and as a result would have no idea what environment they're in or how to decipher docker ps commands!!)
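For the in-container case, a sketch of what a student could run from a terminal inside the browser UI is given below. It assumes a version file has been baked into the image at /var/ou-container/version.txt; both that path and the show_container_info name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the baked-in version file from inside the container.
# Usage: show_container_info [FILE]
# FILE defaults to /var/ou-container/version.txt, an assumed location.
show_container_info() {
  info_file="${1:-/var/ou-container/version.txt}"
  if [ -r "$info_file" ]; then
    cat "$info_file"
  else
    # Report the missing file and signal failure to the caller.
    echo "No version file found at $info_file"
    return 1
  fi
}
```

The student only has to paste back whatever this prints, with no need to know which container name or image ID they are running.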