Open pwalczysko opened 3 years ago
I'll add a few thoughts:
1) We've been playing with the idea of a self-contained "omero-training" docker-compose that we can just spin up in the cloud that would already include a preset database with users, imported data and so on, so I guess an ansible playbook with these features would fulfill a similar need. Ideally, I would love to have an easily-deployable "OME-default OMERO training server" that could run in the cloud, for example.
2) Our users/groups approach was to use OME scripts to create 50 training users in a "training" group, and between training sessions we "clean up" and undo most of the changes that happen during a training session. We're already using OME scripts to do most of this - I think currently the main manual steps are deleting tags and attachments IIRC.
3) To avoid data multiplication we in-place import the same data to every user, which is good, but people still get confused when trying to look at each other's data: they see the same data again. Maybe it might be worth in-place importing the same data, but changing names ("user1_randomimage.tif" instead of "randomimage.tif"). Something for the future!
4) The Crop workflow under OMERO.figure is the kind of functionality I only go into if I have a lot of time to explain it in detail. I understand how useful it can be, but in general for a training session that is limited in time I stick to zoom/pan of multiple panels and don't touch it at all. Once in a while someone will come to me with a "can I do X in OMERO.Figure" question and I'm more than happy to explain that when needed outside the training workshop context.
5) For Python+OMERO, we've moved to using a lot of our own convenience functions in training. Last time we ran an "OMERO for developers" workshop, we spun up a JupyterHub in GCP and cloned our repo with notebooks (https://github.com/TheJacksonLaboratory/omero_for_developers) directly there. Because our GCP instance has VPN access, we could still do work in our internal dev server.
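The idea of in-place importing the same file for every user under a user-specific name could be sketched roughly as follows. This is only a sketch and not the script used at either site: it builds `omero import` command lines without running them. The `--transfer=ln_s`, `-n` and `-d` options are written from memory of the OMERO CLI and should be checked against `omero import --help`; the usernames, dataset IDs and file path are hypothetical.

```python
from pathlib import PurePosixPath


def per_user_name(username: str, image_path: str) -> str:
    """Return the file name prefixed with the username."""
    return f"{username}_{PurePosixPath(image_path).name}"


def build_import_command(username: str, image_path: str, dataset_id: int) -> list[str]:
    """Build (but do not run) one in-place import command for one user.

    Assumption: these flags match the installed OMERO CLI version.
    """
    return [
        "omero", "import",
        "--transfer=ln_s",                          # in-place: link, do not copy
        "-n", per_user_name(username, image_path),  # per-user image name
        "-d", str(dataset_id),                      # hypothetical target dataset ID
        image_path,
    ]


# Hypothetical users and the datasets they own on the training server.
datasets = {"user1": 101, "user2": 102}
commands = [
    build_import_command(user, "/data/training/randomimage.tif", dataset_id)
    for user, dataset_id in datasets.items()
]
for command in commands:
    print(" ".join(command))
```

Each command would still have to run in a session logged in as, or on behalf of, the corresponding user; that part is left out here.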
Some ideas too. We would also love some Docker (or Ansible) setup. It would be nice if somehow the groups/users would be configurable through some kind of yml or toml file. Actually, for our training I would like to create more meaningful users based on well-known French comics. That would not only add a fun side; users would also identify some of the concepts more clearly. If this is configurable, people may use lists of names and groups that are known locally. You may use pictures and demo that feature (I do not think we would run into copyright issues if it is us fetching the image from a URL or so). e.g.
- group: petit village gaulois
- owner: Abraracourcix
- users: Astérix, Obélix, Idéfix, Panoramix, Assurancetourix, Ordralfabétix,...
- group: Tournesol Lab
- owner: Professor Calculus
- users: Tintin, Captain Haddock, Bianca Castafiore, Nestor, Dupond, Dupont, Milou ...
I agree on the advantages and disadvantages of using the same data. One way around this might be that all data are the same except for one dataset, used to demo sharing, in which each user has a "personal" image: a pencil, a lab timer, a lab book, a coffee mug, ...
Two thoughts: YML could make good sense, e.g. in ansible, to synchronize the user/group setup, perhaps as a new CLI plugin?; however, at the moment the closest thing that exists would likely be LDIF --> https://github.com/ome/apacheds-docker
Right, but this is to really manage the accounts... no? I was thinking of something to create a training server that is going to be trashed a few days after the training. The YML could also specify / customize the data to import and different kinds of annotations.
I guess I would urge us to make elements which are composable so that others benefit. e.g. anything that is a CLI plugin can be consumed pretty easily from ansible, etc.
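To illustrate the composability point: anything that can run a command (Ansible, a shell script, a notebook) could consume the existing `omero group add` and `omero user add` commands. The sketch below only builds the argument lists and does not run them. The `--type`, `--group-name` and `-P` options are written from memory and should be checked against `omero group add --help` and `omero user add --help`; the group, user and password values are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingUser:
    username: str
    first_name: str
    last_name: str


def group_command(group: str, group_type: str = "read-annotate") -> list[str]:
    """Arguments for creating one group (assumed flag: --type)."""
    return ["omero", "group", "add", group, "--type", group_type]


def user_command(user: TrainingUser, group: str, password: str) -> list[str]:
    """Arguments for creating one user in a group (assumed flags: --group-name, -P)."""
    return [
        "omero", "user", "add",
        user.username, user.first_name, user.last_name,
        "--group-name", group,
        "-P", password,
    ]


# Illustrative values only.
haddock = TrainingUser("haddock", "Archibald", "Haddock")
print(group_command("Tournesol Lab"))
print(user_command(haddock, "Tournesol Lab", "change-me"))
```

Keeping command construction separate from execution means the same functions could feed `subprocess.run`, an Ansible task list, or a dry-run printout.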
Across OME resources, two places managing user/group creation (differently) are:
Adding my 👍 to the idea of unifying these efforts and converging towards:
@erickmartins @juliomateoslangerak Thank you for your very valuable thoughts and pointers to the teaching material.
Re: ...have an easily-deployable "OME-default OMERO training server" ... and "...thinking of something to create a training server that is going to be trashed a few days after the training...."
In our experience, the main thing to think through here is whether or not this server contains any image data, or how it connects to or imports such data after it has been spun up. In our hands, the data import is the main burden when setting up a training server, and thus we do not trash our training servers (and, as I understand it, neither does @erickmartins), preferring to clean them between trainings instead. Another thing to consider is that your trainees will not much like their training environment being trashed shortly after the training.
@erickmartins
I think currently the main manual steps are deleting tags and attachments IIRC.
Thank you for emphasizing the importance of cleanup. I will try to come up with a cleanup section in the next version of the walkthrough and attach it here. Briefly, some cleanup scripts for tags and attachments (file attachments and MapAnnotations) are already available; let me try to describe them in the walkthrough. You will of course need some adjustments to make them work in your hands/setup, but they give a good lead.
@juliomateoslangerak
Re: would be nice if somehow the groups/users would be configurable...
In addition to the comments and clarifications of @joshmoore and @sbesson, which I agree with, one thing to consider here is how to match the flexible userbase on your servers with any preimported data in a ready-made docker-compose OME training server or similar. This means that, rather than creating new users from scratch every time (such users would by definition have no data to start with), the suggested functionality should keep the userbase the same and manipulate only the naming of the users. In our experience, the image data and their quality have much more impact on participant satisfaction than the user/group setup. Of course, a docker-compose or otherwise pre-fabricated OME training server is a good idea even without any data, but there is some value in seeing the bigger picture here as well, as it might help to spare resources.
We have a "renaming script" for the usernames, and it would be easy to extend the renaming to the groups too. The script uses names of famous scientists.
@juliomateoslangerak You can insert "Asterix" and similar into the script, but one remark, possibly less important at first look, is that every user in OMERO must have a first name and a last name, which Asterix does not (I like Asterix a lot too though 😄). The solution of having first name: Asterix and last name: Asterix does not come across very well for the participants: it is displayed in OMERO.web as Asterix Asterix, which looks like a bug in the software. The OME Team previously worked for a long time with user-1 user-1 type of names in the trainings, so this remark comes from hard-won experience 😄
@erickmartins
Re: your points 3, 4 and 5.
Thank you for these, they concern the naming of images, a balance between longer and shorter workshop and how to prioritize shown features, and teaching of OMERO API materials (thank you very much for this !). I will try to integrate these points into the next version of the workflow.
Indeed, naming users as I suggested will look odd. I remember now being 'disturbed' by this user name duplication in the past. For the other comments, I'm not sure whether I was unclear, or I don't understand what you mean (or both). What I guess I would like, in order to set up a training server on a regular basis, under the idea that the training server is trashed a few days after the training (my assumption is that the users will further experiment on the production server), is:
- a docker-compose to setup the server and eventually run a setup script
- a setup script to copy a number of projects from user from a given server into new users data. This script would consume a configuration file
- a config file to configure groups, users, user names, the source host and projects,...
Does this make sense to you? To me, the great advantage would be that the trainers would be able to easily set up the training environment from the convenience of their regular daily server. I looked carefully into the maintenance scripts you provide and it seems that more or less everything is in there and actually many of the scripts are doing this.
@juliomateoslangerak Thank you for this. Yes, your idea of docker-compose is good. What I wanted to point out (sorry for this being unclear, as I tried to answer too many questions):
from the convenience of their regular daily server
Maybe I do not completely understand this point. Do you mean that the data are in-place imported in the "regular daily server" and all that is needed is to in-place import them into the "newly born" training server? Please expand on this point.
As indicated, let us discuss further; there will certainly be adjustments and compromises needed. Thanks a lot.
- a docker-compose to setup the server and eventually run a setup script
Certainly doable. It may be as easy as either (1) running the training playbook within docker or (2) running it against docker.
- a setup script to copy a number of projects from user from a given server into new users data. This script would consume a configuration file
As @pwalczysko points out, this is the hard one. But certainly what we're all working towards. Until then, you'll need to find a version that works in your setup.
- a config file to configure groups, users, user names, the source host and projects,...
Especially if you are doing all of this in Docker, you might think about adding in the openmicroscopy/apacheds docker and just save an image with all of the users you want.
Hi, thanks everyone for the advice. @pwalczysko, it is perfectly logical that you don't want to hack such a script to deal with all possible scenarios. I think that for us space limitations are not going to be a problem, given that it is a training server for not a lot of people (not 50 anyway) and for a 2-day period, but I have to double-check with Christophe Blanchet (who is responsible for the IFB infrastructure that is going to host our training server).
I worked/hacked in a script that is duplicating a list of projects for every user. It is so far only duplicating the projects, datasets and images. No other annotations yet, but I will work on that for a second training. I think for this one manual annotation will be faster. We need this for next week!! No in-place import yet. One little question there: if I use in-place import and a user deletes an image-original file, will OMERO delete it or keep it as long as there is another reference to it in the database...?
https://github.com/juliomateoslangerak/training-scripts/blob/master/maintenance/scripts/copy_projects.py
@pwalczysko When I talk about the convenience of our daily server, I mean the following. I set up a group in our prod server where I keep, together with other trainers, the data that is going to be used in the training, as it is going to be used in the training. Nicely organized, annotated,... When we come across data that is interesting for the training, as we do for other purposes such as scientific presentations, examples, etc., we can add them to that group too. Then, the day before the presentation, we create a server, we launch the script and we get the same 'environment' for each trainee.
I do not know what you think, but when moving to more basic trainings, this might help core-facility engineers who lack deeper knowledge of OMERO or of the tools for configuring a training server to set up basic training sessions.
@juliomateoslangerak thank you for this, this is really helpful.
I worked/hacked in a script that is duplicating a list of projects for every user. It is so far only duplicating the projects, datasets and images. No other annotations yet, but I will work on that for a second training. I think for this one manual annotation will be faster. We need this for next week!!
Great, thank you for that script. I will try to test it myself, time permitting. I wonder how it will perform with the download and reimport of the image data, and I may have a question about how the https://github.com/juliomateoslangerak/training-scripts/blob/master/maintenance/scripts/copy_projects.py#L49 bit would work for more complex file formats such as .vsi. Please let us know how the script performed on your side.
One little question there: if I use in-place import and a user deletes an image-original file, will OMERO delete it or keep it as long as there is another reference to it in the database...?
Thank you, this is a not so little question. The clear answer is: OMERO will never delete original files imported in-place. It does not matter how many OMERO servers you imported these files in-place into. The assumption I am making in this statement is that these files are indeed in a safe, read-only place (before and after any import to OMERO), which means not in OMERO's Managed Repository. Keeping the files in a safe, read-only place is the recommended and standard situation for in-place imports. All that OMERO deletes when you delete the images from within OMERO are the links to your original files, which, as we established, live in your "in-place", i.e. read-only, location. This setup is also used for the training servers of the OME Team. Also note that an in-place import has to be run from the environment of the machine on which that particular OMERO.server is running.
In your case, I am guessing that the data on the production, "regular daily" server are not imported in-place, but via the classical import route. In that case, what I wrote above is not fully valid. The files are sitting inside the Managed Repository of your production server. If the deletion were executed from the production OMERO server, it would delete these original files from its Managed Repository too. We can expand on and discuss this situation too, but please specify your setup first.
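For the in-place case, the behaviour is the same as for symbolic links on an ordinary filesystem. The following is a plain-Python illustration, not OMERO code, and it assumes the in-place import stored symbolic links to the originals; the directory and file names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # The "safe" location holding the original file.
    original = os.path.join(root, "readonly_store", "randomimage.tif")
    os.makedirs(os.path.dirname(original))
    with open(original, "wb") as handle:
        handle.write(b"pixel data")

    # Two "servers" each hold only a link to the same original file.
    links = []
    for server in ("training_server", "second_server"):
        link = os.path.join(root, server, "randomimage.tif")
        os.makedirs(os.path.dirname(link))
        os.symlink(original, link)
        links.append(link)

    # Deleting on one server removes that server's link, not the target file.
    os.remove(links[0])

    first_link_gone = not os.path.lexists(links[0])
    original_survives = os.path.exists(original)
    second_link_works = os.path.exists(links[1])

print(first_link_gone, original_survives, second_link_works)
```

In the classical import route there is no such link: the file in the Managed Repository is the only copy the server knows about, which is why deleting it there removes the data.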
@pwalczysko When I talk about the convenience of our daily server, I mean the following. I set up a group in our prod server where I keep, together with other trainers, the data that is going to be used in the training, as it is going to be used in the training. Nicely organized, annotated,... When we come across data that is interesting for the training, as we do for other purposes such as scientific presentations, examples, etc., we can add them to that group too. Then, the day before the presentation, we create a server, we launch the script and we get the same 'environment' for each trainee.
Thank you for clarifying that, this makes perfect sense. This is actually similar to the way we (OME Team) approach our training servers, with the exception that the "nice things which we want to use in next training" are accrued and kept on the training servers themselves. Nevertheless, a lot of "nice things" are kept in the IDR too, and we do have some features of what you are drafting here, our "production server", i.e. "regular daily server", being the IDR.
Design issue with the goal to come up with a walkthrough for an external (non-OME-Team) scientist(s)/facility, which should help them to set up a training on OMERO on their site, but without direct participation of the trainers from OME Team.
Please see attached pdf 2021-04-14-training-walkthrough.pdf for the item-by-item walkthrough - this design issue is a work in progress, and new pdfs will come as we are going further with the prep of this walkthrough.
The outline of the walkthrough points:
LIST OF VERSIONS of DETAILED WALKTHROUGH (to be expanded as new content comes)
cc @jburel @juliomateoslangerak @sukunis @erickmartins