Extend telegram bot - Githubissues

Awesome library!

Is there a way to extend the telegram bot to answer more commands?

I suppose I can edit the source here and recompile, but I wonder if there is a different way to do this.

Hey, thanks for the support! We did a brief brainstorm about something like this, to allow users to issue additional commands to the slurm cluster via the bot, e.g. issue a scancel to a job etc. But after some thought we dropped the idea, because it seemed we'd need to implement quite a lot to make it safe, like authentication/authorization. So we focused just on the basic functionality to deliver messages, which is a safe one-way communication (bot->user).

As for how to go about it, it would be the same as you mentioned: edit that part of the code, assign handlers for the additional commands and fill out the functions. I don't know of another way. What did you have in mind to add in terms of commands?

> we dropped the idea because it seemed we'd need to implement quite a lot to make it safe, like authentication/authorization

Makes sense! It's safer if the user just logs in to the cluster via the usual methods.

> What did you have in mind?

Haven't defined this yet, but as ideas:

- query job state, e.g. check the state of a job, check squeue, check old jobs
- configure the bot's verbosity (per user)
- customize the bot's message sent on "/start"
- cancel jobs (though this would be particularly unsafe, so can be left out)
- bot could send notifications regarding the cluster, e.g. "warning: you are reaching your storage quota" (could be out of scope for this project) (see next comment)

I want to use the telegram bot to send more notifications regarding the cluster, e.g. "warning: you are reaching your max storage quota". I'd say it's a bit out of scope for this library to implement it directly. But it would be ideal if we could reuse functionality, something like:

import telegramBot, username2chatId

chatId = username2chatId("some telegram-username")
# note: could be unsafe (the user would have to register their cluster-username via the telegram bot?)
# we might have to work directly with chatId

telegramBot.sendMessage(chatId, "some message")

Or even for any connector with a go package?

import goslmailer
goslmailer.send("some message", "some username", "some connector")

or from the cmd line

$ goslmailer-send "some message" "some username" --connector telegram|email|etc

Does this make sense? I'm just kind of brainstorming here :grin:

Hey, that's some interesting brainstorm you've dropped here. Made me think hard about all that. It does make sense, although I'm having trouble seeing the final picture, it's a bit too far away, still too vague. So let's try some questions to clarify the vision a bit, let's call whatever it is a "product", and I'll just unload the thoughts...

I've picked 2 use cases that I found to be technically "different", then tried to envision how the end product would look from the user's perspective, and which components I'd need to have in place to deliver that product.

1. list user's jobs (squeue) - easy one
2. quota warnings - complicated one?

For those two I then tried to envision what needs to exist to execute them. So, the user opens their phone, goes to the chat with the bot and, let's say, does:

- /jobs to get their jobs in the queue. This is something I see as initiated by the user: the bot can then run squeue, perhaps do some parsing and prettifying, and return the list. Quite simple...
- the quota warnings. That's something initiated not by the user but by the state of the system itself. So what would be needed is:
  a) the user must register for quota warnings, e.g. with a /quotanotify command to the bot (some users might not want these sent to them)
  b) the bot needs to start monitoring this user's quota, or there is another monitoring component sitting behind the bot, with which the bot registers the user; that component monitors the quota and sends the bot a notification to alert the user: [monitor]<-->[bot]<-->[user]

In both cases, what's definitely needed is a map between telegram user uids and cluster uids (t-uid<-->c-uid). It could be maintained manually, or built with tokens generated on the cluster, with which users then authenticate themselves to the bot (here we're entering dangerous territory :laughing: ). Also, to keep things useful, the monitor must be configurable/modular enough to be able to check quotas on different FSs (e.g. beegfs-ctl --getquota, etc.)
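To make the t-uid<-->c-uid map more concrete, here is a minimal in-memory sketch. Everything in it (UserMap, Register, ClusterUser, ChatID) is a hypothetical name and not part of goslmailer; a real version would also need persistence and, above all, a trustworthy way to decide who may register which cluster user.

```go
package main

import (
	"fmt"
	"sync"
)

// UserMap is a hypothetical in-memory table linking Telegram chat IDs
// (t-uid) to cluster usernames (c-uid) in both directions.
type UserMap struct {
	mu        sync.RWMutex
	toCluster map[int64]string
	toChat    map[string]int64
}

func NewUserMap() *UserMap {
	return &UserMap{
		toCluster: make(map[int64]string),
		toChat:    make(map[string]int64),
	}
}

// Register links chatID and clusterUser, dropping any older link that
// involved either of them so the mapping stays one-to-one.
func (m *UserMap) Register(chatID int64, clusterUser string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if old, ok := m.toCluster[chatID]; ok {
		delete(m.toChat, old)
	}
	if old, ok := m.toChat[clusterUser]; ok {
		delete(m.toCluster, old)
	}
	m.toCluster[chatID] = clusterUser
	m.toChat[clusterUser] = chatID
}

// ClusterUser resolves a chat ID to a cluster username (query direction).
func (m *UserMap) ClusterUser(chatID int64) (string, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	u, ok := m.toCluster[chatID]
	return u, ok
}

// ChatID resolves a cluster username to a chat ID (notify direction).
func (m *UserMap) ChatID(clusterUser string) (int64, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	id, ok := m.toChat[clusterUser]
	return id, ok
}

func main() {
	m := NewUserMap()
	m.Register(42, "alice")
	if user, ok := m.ClusterUser(42); ok {
		fmt.Println("chat 42 belongs to cluster user", user)
	}
}
```

Keeping both directions in one structure means the notify flow (cluster user to chat) and the query flow (chat to cluster user) cannot drift apart.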

`flow-notify`

This flow covers UC-2 (notify quota warnings)

I see something like this: [monitor] --> [messager] --> [connector] --> [user].

`[monitor]`

- Process/service in the cluster that decides when to send messages to the user
- Currently, SLURM itself works as a [monitor]: it calls MailProg on events START, END, etc
- To implement UC-2 we'd need to implement a [monitor] (e.g. monitor-quota) that checks user quotas and decides when to notify the user. For example:
  - A simple script run periodically with crontab
  - The script: (1) checks user quotas, (2) decides which users to notify, (3) calls MailProg --user <chat-id> --message "warn: you're near the quota"
- (for now) To avoid the /quotanotify command via the bot, the user registers via the cluster directly
  - e.g. run monitor-quota register --connector telegram --chat-id <my-chat-id>
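Steps (1) and (2) of such a script could be sketched roughly as below. The Quota type, the usersToNotify function and the 90% threshold are all assumptions made for illustration; how the usage numbers are collected (e.g. by parsing the output of a filesystem's quota tool) is deliberately left out.

```go
package main

import "fmt"

// Quota is a hypothetical record of one user's storage usage.
type Quota struct {
	User  string
	Used  uint64 // bytes in use
	Limit uint64 // quota in bytes, 0 means "no limit set"
}

// usersToNotify returns the users whose usage is at or above the given
// fraction of their limit (e.g. 0.9 for 90%). Users without a limit are
// skipped.
func usersToNotify(quotas []Quota, threshold float64) []string {
	var users []string
	for _, q := range quotas {
		if q.Limit == 0 {
			continue
		}
		if float64(q.Used)/float64(q.Limit) >= threshold {
			users = append(users, q.User)
		}
	}
	return users
}

func main() {
	quotas := []Quota{
		{User: "alice", Used: 95, Limit: 100},
		{User: "bob", Used: 10, Limit: 100},
		{User: "carol", Used: 500, Limit: 0},
	}
	for _, u := range usersToNotify(quotas, 0.9) {
		// Step (3) would hand this message to the [messager].
		fmt.Printf("to %s: warn: you're near the quota\n", u)
	}
}
```

A real monitor would probably also remember whom it already warned, so a cron run every few minutes does not repeat the same message.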

`[messager]`

- Process/service in the cluster handling messages from cluster to connector
- Simply receives a message and forwards it to a connector, e.g. it provides a command like [messager] --connector telegram|email|slack|etc --user chat-id|email-address|etc
- Currently, goslmailer accomplishes this
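The "receive a message and forward it to a named connector" idea can be sketched with a small interface and a registry. This is only an illustration of the shape, not how goslmailer is actually structured internally; Connector, Messager and recorder are hypothetical names.

```go
package main

import "fmt"

// Connector is a hypothetical interface every delivery backend
// (telegram, email, ...) would implement.
type Connector interface {
	Send(recipient, message string) error
}

// Messager forwards messages to connectors registered by name.
type Messager struct {
	connectors map[string]Connector
}

func NewMessager() *Messager {
	return &Messager{connectors: make(map[string]Connector)}
}

func (m *Messager) Register(name string, c Connector) {
	m.connectors[name] = c
}

// Send looks up the connector by name and hands the message over.
func (m *Messager) Send(connector, recipient, message string) error {
	c, ok := m.connectors[connector]
	if !ok {
		return fmt.Errorf("unknown connector %q", connector)
	}
	return c.Send(recipient, message)
}

// recorder is a stand-in connector that only records what it was asked
// to deliver; a real one would talk to a bot or a mail server.
type recorder struct {
	sent []string
}

func (r *recorder) Send(recipient, message string) error {
	r.sent = append(r.sent, recipient+": "+message)
	return nil
}

func main() {
	m := NewMessager()
	tg := &recorder{}
	m.Register("telegram", tg)

	if err := m.Send("telegram", "123456", "warn: you're near the quota"); err != nil {
		fmt.Println("send failed:", err)
	}
	fmt.Println(tg.sent)
}
```

With this shape, a new [monitor] only needs to know a connector name and a recipient, which is what the command line form above expresses.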

`[connector]`

- Service in the cluster handling messages from cluster to user
- Handles the actual communication with the bot
- Currently, tgslurmbot accomplishes this (also matrixslurmbot and discoslurmbot)
- I'd think gobler can be a [connector] as well: instead of connecting to a bot, it spools messages and forwards them to another [connector] (like a decorator)

`flow-query`

This covers UC-1 (query current jobs from the telegram-bot)

The messaging here needs to be in both directions: [listener] <--> [connector] <--> [user].

Though notice the flow is always initiated by the user: [user] --> [connector] --> [listener] --> [connector] --> [user]

`[listener]`

- Service in the cluster listening for queries, somewhat similar to the [monitor] from before
- Requires authentication :warning:
  - For example, require receiving a secret token (unique per cluster-user, generated randomly once)
  - If the token is incorrect, do not run any command in the cluster and return an error message
- To implement UC-1 we'd set up a script that runs squeue, parses the output, and returns it in a serialized format
  - The [listener] can be called like: slurm-listener list-jobs
- SLURM's REST API might work to this end (?)
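The "parse squeue and return a serialized format" step could look roughly like this. The sketch assumes the listener runs squeue with --noheader and --format "%i|%T|%j" (job ID, state, name), and that JSON is the serialized format; the Job type and parseSqueue are hypothetical names. The job name is placed last so that a "|" inside a name does not break the split.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Job is a hypothetical serialized view of one squeue line.
type Job struct {
	ID    string `json:"id"`
	Name  string `json:"name"`
	State string `json:"state"`
}

// parseSqueue parses output assumed to come from
//   squeue --noheader --format "%i|%T|%j"
// Blank or malformed lines are skipped.
func parseSqueue(out string) []Job {
	var jobs []Job
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		// At most 3 fields: everything after the second "|" is the name.
		f := strings.SplitN(line, "|", 3)
		if len(f) != 3 {
			continue
		}
		jobs = append(jobs, Job{ID: f[0], State: f[1], Name: f[2]})
	}
	return jobs
}

func main() {
	// Sample text standing in for real squeue output.
	sample := "1001|RUNNING|align\n1002|PENDING|assemble\n"

	b, err := json.Marshal(parseSqueue(sample))
	if err != nil {
		fmt.Println("serialize failed:", err)
		return
	}
	fmt.Println(string(b))
}
```

Keeping the parser separate from the code that actually executes squeue makes it testable without a cluster.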

`[connector]`

- Service in the cluster receiving user queries from the bot
- Requires the user to be authenticated :warning:
  - Stores a mapping chat-id --> token
- Listens to a set of commands and calls the [listener]
  - e.g. listen to /jobs, call slurm-listener list-jobs, return the result via the chat
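Independent of any bot library, the connector side of the query flow boils down to: look up the command, look up the token for the chat, call the listener, send back the reply. A rough sketch follows; Router, Handler and the reply texts are assumptions, and the handler here is a stub instead of a real call to slurm-listener.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	replyUnknown         = "unknown command"
	replyUnauthenticated = "not authenticated: send /auth <token> first"
)

// Handler is a hypothetical call into the [listener]; it receives the
// token stored for the chat and returns the text to send back.
type Handler func(token string) (string, error)

// Router holds the chat-id --> token mapping and the command handlers.
type Router struct {
	tokens   map[int64]string
	handlers map[string]Handler
}

func NewRouter() *Router {
	return &Router{tokens: make(map[int64]string), handlers: make(map[string]Handler)}
}

func (r *Router) Authenticate(chatID int64, token string) { r.tokens[chatID] = token }
func (r *Router) Handle(command string, h Handler)        { r.handlers[command] = h }

// Dispatch turns an incoming chat message into a reply. No handler runs
// unless the chat has a stored token.
func (r *Router) Dispatch(chatID int64, text string) string {
	fields := strings.Fields(text)
	if len(fields) == 0 {
		return replyUnknown
	}
	h, ok := r.handlers[fields[0]]
	if !ok {
		return replyUnknown
	}
	token, ok := r.tokens[chatID]
	if !ok {
		return replyUnauthenticated
	}
	out, err := h(token)
	if err != nil {
		return "error: " + err.Error()
	}
	return out
}

func main() {
	r := NewRouter()
	r.Authenticate(42, "example-token")
	r.Handle("/jobs", func(token string) (string, error) {
		// Stub: a real handler would call the [listener] with the token.
		return "1001 RUNNING align", nil
	})
	fmt.Println(r.Dispatch(42, "/jobs"))
	fmt.Println(r.Dispatch(7, "/jobs"))
}
```

Note that the router only checks that a token exists for the chat; deciding whether the token is valid stays with the [listener].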

Security

Authentication can get complicated and risky. Some thoughts:

Safety first

- Implement only safe queries in the [listener], e.g.:
  - squeue: seems safe enough, it only queries system state and returns it
  - scancel: seems unsafe! (modifies system state). Do not implement at all (or implement at your own risk)
- Limit queries per minute --> mitigates flooding (DoS) attacks
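A per-chat "queries per minute" limit could be as simple as a fixed one-minute window counter. This is a minimal sketch under that assumption, with hypothetical names (Limiter, Allow); a token bucket would smooth bursts better, and a long-running service would also need to evict idle entries from the map.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// window counts the queries one chat made since start.
type window struct {
	start time.Time
	count int
}

// Limiter allows at most perMinute queries per chat in each
// one-minute window.
type Limiter struct {
	mu        sync.Mutex
	perMinute int
	now       func() time.Time // replaceable clock, handy for tests
	windows   map[int64]*window
}

func NewLimiter(perMinute int) *Limiter {
	return &Limiter{
		perMinute: perMinute,
		now:       time.Now,
		windows:   make(map[int64]*window),
	}
}

// Allow reports whether chatID may run one more query right now.
func (l *Limiter) Allow(chatID int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.perMinute <= 0 {
		return false
	}
	now := l.now()
	w, ok := l.windows[chatID]
	if !ok || now.Sub(w.start) >= time.Minute {
		// First query from this chat, or the old window expired.
		l.windows[chatID] = &window{start: now, count: 1}
		return true
	}
	if w.count >= l.perMinute {
		return false
	}
	w.count++
	return true
}

func main() {
	l := NewLimiter(2)
	for i := 1; i <= 3; i++ {
		fmt.Printf("query %d allowed: %v\n", i, l.Allow(42))
	}
}
```

The check would sit in the [connector], before anything is passed on to the [listener].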

Authentication flow (initial idea)

To authenticate, the user must:

1. generate the token once by running directly in the cluster: generate-token. This returns the token to stdout, e.g. 123456789, and stores it somewhere safe (e.g. /home/<username>/.mysecrettoken, no read/write access for other cluster-users)
2. send this command to the bot once: /auth 123456789. The [connector] stores the mapping chat-id --> token somewhere safe in the cluster (no read/write access to cluster-users). After that, the user can send commands through the bot to run authenticated queries.

To run authenticated commands: the [listener] first validates the token by calling validate-token <cluster-uid> <token>, and only then runs the actual query
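As a rough sketch of what generate-token and validate-token might do internally: the token comes from a cryptographic random source (longer than the 123456789 example above), and validation uses a constant-time comparison. Storing a SHA-256 digest instead of the raw token is an extra assumption added here, not something proposed in the flow; all function names are hypothetical.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// generateToken returns a random 128-bit token as 32 hex characters.
func generateToken() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// hashToken returns the digest that would be kept on disk, so the file
// holding it never contains the token itself (an assumption of this
// sketch).
func hashToken(token string) [sha256.Size]byte {
	return sha256.Sum256([]byte(token))
}

// validateToken checks a presented token against the stored digest
// using a constant-time comparison.
func validateToken(stored [sha256.Size]byte, presented string) bool {
	h := hashToken(presented)
	return subtle.ConstantTimeCompare(stored[:], h[:]) == 1
}

func main() {
	token, err := generateToken()
	if err != nil {
		fmt.Println("could not generate token:", err)
		return
	}
	stored := hashToken(token)

	fmt.Println("correct token accepted:", validateToken(stored, token))
	fmt.Println("wrong token accepted:  ", validateToken(stored, "123456789"))
}
```

File permissions on wherever the digest lives still matter, as described in step 1 above.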

Putting all together

Both flows are different enough to be treated separately, however, the [connector] for both flows must be the same service (at least for a telegram bot).

A simplified scheme would look like this:

[monitor] --> [messager] --\
                            \
                             --> [connector] <--> [user]
                            /
             [listener] <--/

A more detailed scheme (blue is flow-notify, orange is flow-query):

[Figure: tgslurmbot-diagram]

The flow-notify is already covered by goslmailer :tada:. Some questions:

Can we reuse the [messager] for other use-cases with the flow-notify?
- Say I develop a [monitor] to check user quotas periodically (ideally, this would be configurable to support multiple FSs)
- Can I call something like goslmailer <chat-id> <msg> --connector telegram? Can we extend goslmailer to support this?
- Alternatively, can I run tgslurmbot <chat-id> <msg> directly? This would not support other connectors or the gobler, but would cover my use-case with telegram

Can we extend the [connector] to support the flow-query?
- Say I implement a [listener] for SLURM, an [authenticator], and a [mapping]
- Can I provide my own code to add more bot commands? Or maybe import the basic configuration, e.g.:

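import (
	"os/exec"
	"strings"

	tele "gopkg.in/telebot.v3"
	// hypothetical packages, neither exists in goslmailer today
	"configBot"
	"mapping"
)

func main() {
	// this applies the current configuration from tgslurmbot
	b := configBot.New()

	// add my own handlers
	b.Handle("/jobs", func(c tele.Context) error {
		token := mapping.ChatId2Token(c.Chat().ID)
		// run "slurm-listener list-jobs", passing the token on stdin (assumed protocol)
		cmd := exec.Command("slurm-listener", "list-jobs")
		cmd.Stdin = strings.NewReader(token)
		response, err := cmd.Output()
		if err != nil {
			return c.Send("error: could not list jobs")
		}
		return c.Send(string(response))
	})
	b.Start()
}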

Hey, that's quite thorough planning. Respect. In theory, it's all doable, but I'd ask you for a day or two to reread and digest/think it all through before I reply.

CLIP-HPC / goslmailer

Extend telegram bot #26
