ansible / workshops

Training Course for Ansible Automation Platform

Grafana Dashboard Progress Tracker #435

Open ffirg opened 5 years ago

ffirg commented 5 years ago
SUMMARY

I've been using a Grafana-based dashboard to track student progress during workshops. This is useful because timings and progress can vary between sessions, and it gives a gauge to keep things on track.

ISSUE TYPE
COMPONENT NAME
ADDITIONAL INFORMATION

I have a pre-configured InfluxDB/Grafana single VM combo which I use for the recording of completed exercises. Each student runs a small shell script, which creates a playbook, which is run at the end of each exercise to record progress.

The student runs something like this:

$HOME/workshops/exercises/ansible_rhel/completed.sh 1.2-adhoc

(see it in action at: https://github.com/pharriso/workshops/tree/master/exercises/ansible_rhel/1.1-setup#final-step-mark-exercise-as-complete )

It's not a perfect solution, as I currently have to git clone the necessary repo so the completed.sh script is in place. But it does work!

Standing up the InfluxDB/Grafana server could be done at the provisioner stage?
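
Roughly, all that stage would need to do on a single VM is something like this (just a sketch; it assumes the InfluxData and Grafana package repos are already configured):

```bash
# Sketch only: stand up the single InfluxDB/Grafana VM during provisioning.
# Assumes the influxdata and grafana yum repos are already configured.
sudo dnf install -y influxdb grafana
sudo systemctl enable --now influxdb grafana-server

# open the Grafana UI (3000) and InfluxDB HTTP API (8086) ports
sudo firewall-cmd --permanent --add-port=3000/tcp --add-port=8086/tcp
sudo firewall-cmd --reload
```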

You end up with a dashboard like this:

(Screenshot: example Grafana progress dashboard)
liquidat commented 4 years ago

The idea is great - as mentioned in the other issues, we should combine the work we do across the workshops. You have code for this ready, right? Why not start with a PR for this? Or work with @cloin directly on this?

ffirg commented 4 years ago

@liquidat @cloin

Been working on the dashboard idea and have a working draft. I've forked the workshop and added a new role: https://github.com/ffirg/workshops/tree/devel/provisioner/roles/progress_dashboard

In order to test I've been firing up a small RHPDS RHEL workshop and using this little snippet to inject the dashboarding into the student1 control node: https://github.com/ffirg/workshops/blob/devel/provisioner/dashboard_snippet.yml

There are a couple of things I'd like to fix:

  1. I'm using admin_username/password vars for ease of development; this should switch to the credentials created on demand when the lab is spun up.
  2. Grafana over HTTPS has been a real pain, constantly complaining about TLS, so I've switched to HTTP for now.

The above puts the dashboard framework in place.

We then need a way for students to update progress or automatically do this as they proceed:

  1. The student could be asked to run a generic script/playbook which takes the lab type (from the dir structure) and exercise number and updates the DB. Each exercise would need instructions to run something at the end of the exercise. I’ve used this approach in past workshops and it works.
  2. There is also something like Python's watchmedo (1). This can watch for files (i.e. the playbooks as the student creates them) and run 'tricks' when files are created. It would work great for 'engine' exercises but not Tower OOB (rough sketch below).
  3. I looked at systemd file monitoring as well, but that isn’t flexible enough, so I binned that approach.

(1) https://github.com/gorakhargosh/watchdog
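
For option 2, a rough sketch of what that could look like (paths are just examples, and a real command would presumably call completed.sh or write to InfluxDB rather than logging):

```bash
# Rough sketch: watch the student's exercise tree and react when playbooks
# are created or changed. Paths are illustrative.
pip install --user watchdog

watchmedo shell-command \
  --patterns="*.yml;*.yaml" \
  --recursive \
  --command='echo "${watch_event_type} ${watch_src_path}" >> /tmp/progress.log' \
  ${HOME}/workshops/exercises/ansible_rhel/
```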

I’ve not put in a PR yet as I wanted to get your opinions mainly on how to proceed with how students would update progress.


liquidat commented 4 years ago

@ffirg I would prefer to watch the files automatically instead of having students launch a shell script. watchmedo looks promising. We could also make all playbooks check-mode ("dry run") capable, run them in check mode every few seconds, and report when any of them runs through without problems. We want the playbooks on the machine anyway to provide the students with the solutions in case they just want to copy & paste. If we use such an "ansible playbook daemon" we could also query certain expected Tower configs and report on them.
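
Very roughly, something like this (paths and the reporting step are just placeholders):

```bash
#!/usr/bin/env bash
# Rough sketch of the "ansible playbook daemon" idea: loop over the solution
# playbooks and report whichever ones run clean in check mode.
while true
do
  for pb in ${HOME}/solutions/*.yml
  do
    if ansible-playbook --check "$pb" >/dev/null 2>&1
    then
      echo "$(date -Is) $(basename "$pb") passes check mode"
      # a real version would push this to InfluxDB instead of echoing
    fi
  done
  sleep 30
done
```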

Though tbh I am not sure how much work that would be.

Maybe @cloin has better ideas?

cloin commented 4 years ago

@ffirg, @liquidat I was talking with @cigamit about this last week as well.

Skylight had something like this, and it depended on a student running something that could be collected by the progress checker. If there is value in a progress tracker, I'd like to try to apply it to all workshop types. So this means that each workshop type would have its own dictionary of things to look for that would track progress. I think this quickly becomes too complicated.

I was thinking... what if it's a bit higher level? What if we implement some generic statistics for all workshop types? Break down some Tower metrics/activity by student number, collected from each student's Tower API. Track job run successes/failures for each student, track credential/project/template creation, etc.

This would allow us to apply it to all workshop types that include Tower installations for each student, and it allows the instructor to check in with students who are experiencing a high number of failures or who do not seem to be progressing as quickly through the workshop based on how many resources have been created. This way, nothing is required of the student. Of course, that means we don't get to see exactly where they are, but just having SOME data on every student tells us they're at least making some kind of progress and interacting with Tower.
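
For example, something along these lines per student (hostname pattern and credentials are just placeholders):

```bash
# Illustrative only: pull a few per-student counters from a student's Tower API.
STUDENT=student1
TOWER="https://${STUDENT}.example.com"

curl -sku admin:"${TOWER_PASSWORD}" "${TOWER}/api/v2/jobs/?status=failed" | jq .count
curl -sku admin:"${TOWER_PASSWORD}" "${TOWER}/api/v2/jobs/?status=successful" | jq .count
curl -sku admin:"${TOWER_PASSWORD}" "${TOWER}/api/v2/job_templates/" | jq .count
curl -sku admin:"${TOWER_PASSWORD}" "${TOWER}/api/v2/projects/" | jq .count
curl -sku admin:"${TOWER_PASSWORD}" "${TOWER}/api/v2/credentials/" | jq .count
```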

If more granular collection needs to be done, that could be a secondary set of things that are collected per workshop type.

cigamit commented 4 years ago

Maybe it's just the wording, but a correction on Skylight: the students didn't have to run anything. Since they were saving their playbooks to Git, and we were telling them to create very specific folder and file names during the lessons, we just did Git checkouts of each of their repos and looked for the file/folder names for each lesson to check their progress. We didn't check the validity of the playbooks, but we could have grepped them for specific portions if we wanted (or linted them).
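
In other words, the checker side looked conceptually like this (the repo URL pattern and expected paths here are made up for illustration):

```bash
# Sketch of the Skylight-style check: clone each student's repo and look for
# the files/folders the lessons tell them to create.
for i in $(seq 1 "${STUDENT_COUNT:-20}")
do
  repo="/tmp/student${i}"
  git clone -q "https://git.example.com/student${i}/workshop.git" "$repo" 2>/dev/null
  for f in lesson1/apache.yml lesson2/iis_basic/site.yml
  do
    [ -e "${repo}/${f}" ] && echo "student${i}: ${f} present"
  done
done
```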

Colin is correct that each workshop will require its own way of checking, since every workshop does things completely differently. Some, like the Windows workshop, can be checked from a single location (git checks, Tower API checks), while others may require logging in to each machine and checking them.

We don't technically have to tackle this for every workshop from the get-go, though. We just have to decide on a framework for how to enable these checks, and where to push the data to.

ffirg commented 4 years ago

The greatest benefit of a progress tracker for me is that you can gauge where students are at any point in time. We've been finding that students in more recent workshops aren't getting through the material like they were before. I know we're doing lots of work around streamlining the content, so a progress tracker would enable us to monitor these changes per workshop; or, if there was one generic DB/dashboard, it could be used to centralise all workshops and report against that.

I think the simplest and original idea is still the best, and I’ve used it in anger across a number of workshops. Get each student to run something in the exercise instructions at the end of each exercise. That way we don’t need to keep any dictionary of things to check, avoiding complication.

Just drop the one shell script into the central repo, which every student uses to run at the end of the exercise.

I’ve used this before:

cat /Users/pgriffit/workshops.old/exercises/ansible_rhel/completed.sh

```bash
#!/usr/bin/env bash
#
# update InfluxDB - student progress checker
#

if [ -z "$1" ]
then
  echo "Please supply lesson as an argument"
  exit 99
fi

prereq="influxdb"
export lesson="$1"
export student="$USER"
export lab="rhel"

# install the InfluxDB Python client if it is not already present
pip show $prereq >/dev/null 2>&1
if [ $? -ne 0 ]
then
  echo "Need to install pre-reqs..."
  pip install influxdb --user >/dev/null 2>&1
fi

# write out a small playbook that records the completed lesson, then run it
cat >${HOME}/influxdb-update.yml <<EOF
# (playbook body elided in the original comment)
EOF

ansible-playbook ${HOME}/influxdb-update.yml
```
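
The playbook body is trimmed above, but the net effect is just a single InfluxDB write; against the 1.x HTTP API that would be roughly this (host and database names are placeholders):

```bash
# Illustrative only: what the generated playbook effectively does.
curl -s -XPOST "http://dashboard.example.com:8086/write?db=workshop" \
  --data-binary "progress,student=${student},lab=${lab} lesson=\"${lesson}\""
```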

If we can assess the lab type and exercise from the directory structure, then these can be silently passed into the script/playbook, so they just have to run ./completed.sh at the end of every lesson. We already have the student number through $USER.
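
Roughly like this, assuming the student runs it from the exercise directory (the exercises/ansible_<lab>/<exercise>/ layout is the assumption here):

```bash
# Sketch: derive student, lab type and exercise from the working directory,
# so completed.sh needs no arguments.
exercise="$(basename "$PWD")"
lab="$(basename "$(dirname "$PWD")")"
lab="${lab#ansible_}"
student="$USER"
echo "student=${student} lab=${lab} exercise=${exercise}"
```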

Thoughts?


cigamit commented 4 years ago

As we said, every workshop is different. In the Windows workshop, there is no terminal and no command line. So in order for a user to run something, they would have to import a repo, create a template in Tower, and run it from there. It would add a bit of time to the workshop. We could pre-import all of that for them, but then we again run into a scenario where we have to rely on the students to run it. If we are finding them unreliable in completing the lessons, can we trust them to run the "lesson complete" script? They could go the other route and not complete the lessons at all, and instead just run the complete script for each lesson.

IPvSean commented 4 years ago

@cigamit what if we dedicate the student1 workbench to the instructor, who just runs an instructor inventory that can check, based on workshop_type and in check mode, that a lesson has been completed? @Spredzy has been using scripts to grab the solutions for each exercise... we could create a framework here and start with the Windows or RHEL workshop to test it out? Can we host this algorithm/playbook on the control node?

cigamit commented 4 years ago

That would partially work, but it would still show some "changes" on several of the Windows exercises, since they create survey prompts and students can throw just about anything in there. Some of the lessons also don't have an "exercise" but instead manually set things up in Tower. Also, we have some bonus exercises that result in removing IIS, etc., which would make the original exercise playbook fail in check mode.

I think we will run into this issue with every workshop, as they all do things completely differently. In the end, we may have to write "grading" playbooks for each workshop type if we want to do it right. This is also why I stuck with checking their actual git repo for files/folders that should exist.