Closed hariseldon99 closed 1 year ago
Hey, could you take a look at the log file, or share it here, and see if anything is reported there? If you're using debugconfig, be careful not to give us too much info, since it dumps the whole configuration into the log :)
Here is the output throughout the execution of a slurm batch job:
sh-4.4# cat /tmp/goslmailer.log
tgslurmbot:2023/02/26 08:48:58.327668 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 08:48:58.327705 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 08:48:58.327716 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 08:48:58.327723 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 08:48:58.327729 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 08:48:58.327736 tgslurmbot.go:58: Starting: "testbot"
Let's try the following:
- What is the output of scontrol show config|grep -i mail?
- Is /etc/slurm/goslmailer.conf present and readable by the slurm user?
- Is /etc/slurm/telegramTemplate.html readable by the slurm user?

Initial thought is that slurm is not invoking MailProg (goslmailer) for some reason, so let's try to see if that is true, and why.
This is what you might see in slurmctld.log on failure:
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: slurmctld: error: MailProg returned error, it's output was '2023/02/26 04:58:24 Initializing connector: discord
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: 2023/02/26 04:58:24 Initializing connector: mailto
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: 2023/02/26 04:58:24 Initializing connector: matrix
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: 2023/02/26 04:58:24 Initializing connector: mattermost
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: 2023/02/26 04:58:24 Initializing connector: msteams
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: 2023/02/26 04:58:24 Initializing connector: slack
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: 2023/02/26 04:58:24 Initializing connector: telegram
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: panic: runtime error: invalid memory address or nil pointer dereference
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4e8729]
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: goroutine 1 [running]:
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: log.(*Logger).Output(0x0, 0x20, {0xc000083140, 0x51})
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: /opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:165 +0x89
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: log.(*Logger).Fatalf(0xc0000c5050, {0xddcfb6, 0xdb25af}, {0xc00033ff50, 0x0, 0xc00004c6e0})
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: /opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:210 +0x4c
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: main.main()
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: /home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:47 +0x1c5
Feb 26 04:58:24 ctl-0.local.lan slurmctld[867]: '
Make sure that the log file is writable by the slurm user. The instance above was caused by a log file owned by root and not writable by the slurm user (0644 mode).
I will do a PR to fix this panic.
In your case, if you ran the tgslurmbot binary as root first, it might have created the log file as root:root without +w for others. The slurm user, which goslmailer later runs as, might then not have been able to write to it, and goslmailer would have panicked.
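One way to test this theory is to check the log file from a shell running as the slurm user. The helper below is only a sketch: `check_logfile` is a made-up name, and the account name and path in the usage note are the ones mentioned in this thread, so they may differ on your system.

```shell
# check_logfile PATH
# Shows owner and mode of PATH and reports whether the *current* user can write to it.
# Returns 0 if writable, 1 if not writable, 2 if the file does not exist.
check_logfile() {
    logfile="$1"
    if [ ! -e "$logfile" ]; then
        echo "missing: $logfile"
        return 2
    fi
    # Owner, group and mode, e.g. "-rw-r--r-- 1 root root ..."
    ls -l "$logfile"
    if [ -w "$logfile" ]; then
        echo "writable: $logfile"
    else
        echo "NOT writable: $logfile"
        return 1
    fi
}
```

For example, start a shell as that user with `su -s /bin/sh slurm`, paste the function, and run `check_logfile /tmp/goslmailer.log`. A "NOT writable" result would match the root-owned 0644 file described above.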
Many thanks for your attention.
So I fired up the old docker-compose and tailed the slurmctld.log as I submitted a generic MPI job. Sure enough, errors galore!
I redacted the debug messages to not clutter the post. The full logs are here on pastebin.
# scontrol show config|grep -i mail
MailDomain = (null)
MailProg = /usr/local/bin/goslmailer
sh-4.4# tail -f /var/log/slurm/slurmctld.log
[2023-02-26T15:22:23.706] _slurm_rpc_submit_batch_job: JobId=155 InitPrio=4294901754 usec=357
[2023-02-26T15:22:27.302] error: MailProg returned error, it's output was '2023/02/26 15:22:26 Initializing connector: discord
2023/02/26 15:22:26 Initializing connector: mailto
2023/02/26 15:22:26 Initializing connector: matrix
2023/02/26 15:22:27 Initializing connector: mattermost
2023/02/26 15:22:27 Initializing connector: msteams
2023/02/26 15:22:27 Initializing connector: slack
2023/02/26 15:22:27 Initializing connector: telegram
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4e8729]
goroutine 1 [running]:
log.(*Logger).Output(0x0, 0x20, {0xc00010e120, 0x51})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:165 +0x89
log.(*Logger).Fatalf(0xc00016c180, {0xddcfb6, 0xdb25af}, {0xc00063ff50, 0xc000100570, 0x40ecc7})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:210 +0x4c
main.main()
/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:47 +0x1c5
'
[2023-02-26T15:22:36.166] _job_complete: JobId=155 WEXITSTATUS 0
[2023-02-26T15:22:36.418] _job_complete: JobId=155 done
[2023-02-26T15:22:36.676] error: MailProg returned error, it's output was '2023/02/26 15:22:36 Initializing connector: discord
2023/02/26 15:22:36 Initializing connector: mailto
2023/02/26 15:22:36 Initializing connector: matrix
2023/02/26 15:22:36 Initializing connector: mattermost
2023/02/26 15:22:36 Initializing connector: msteams
2023/02/26 15:22:36 Initializing connector: slack
2023/02/26 15:22:36 Initializing connector: telegram
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4e8729]
goroutine 1 [running]:
log.(*Logger).Output(0x0, 0x20, {0xc00003d1a0, 0x51})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:165 +0x89
log.(*Logger).Fatalf(0xc000037110, {0xddcfb6, 0xdb25af}, {0xc0003bff50, 0xc00005d170, 0x40ecc7})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:210 +0x4c
main.main()
/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:47 +0x1c5
'
/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:47 +0x1c5
This looks like you're hitting exactly what I described above.
Do this:
- Create separate goslmailer.conf and tgslurmbot.conf files (do not symlink them).
- Change the logfile lines to point to two separate files, e.g.
"logfile": "/tmp/goslmailer.log",
and
"logfile": "/tmp/tgslurmbot.log",

Then try it out and let me know how that worked.
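To double-check the split before submitting another job, something like the following could be used. It is a rough sketch, not part of goslmailer: `logfile_of` and `configs_are_split` are hypothetical helper names, and the `sed` pattern assumes each config has a single `"logfile": "..."` entry on its own line, as in the configs posted in this thread.

```shell
# Print the "logfile" value from a goslmailer-style JSON config (first match only).
logfile_of() {
    sed -n 's/.*"logfile"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Succeed only if neither config is a symlink and the two logfile values differ.
configs_are_split() {
    conf_a="$1"
    conf_b="$2"
    if [ -L "$conf_a" ] || [ -L "$conf_b" ]; then
        echo "at least one config is a symlink"
        return 1
    fi
    if [ "$(logfile_of "$conf_a")" = "$(logfile_of "$conf_b")" ]; then
        echo "both configs use the same logfile"
        return 1
    fi
    echo "ok: separate configs and logfiles"
}
```

Usage would be `configs_are_split /etc/slurm/goslmailer.conf /etc/slurm/tgslurmbot.conf`.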
Hi,
Update to update: Turns out this was trivial. Downloading the markdown example from GitHub with wget and replacing the malformed HTML template with it solved the issue.
Thanks for the insight. I think you can close this issue now unless there is anything else that needs to be addressed.
Old Update
I reworked the log file permissions issue by adding the slurm user to the root group and setting g+w on the log files. It seems to have solved the previous problem, but has highlighted a new one. Is the telegram template file misconfigured?
[2023-02-26T17:56:37.182] error: MailProg returned error, it's output was '2023/02/26 17:56:37 Initializing connector: discord
2023/02/26 17:56:37 Initializing connector: mailto
2023/02/26 17:56:37 Initializing connector: matrix
2023/02/26 17:56:37 Initializing connector: mattermost
2023/02/26 17:56:37 Initializing connector: msteams
2023/02/26 17:56:37 Initializing connector: slack
2023/02/26 17:56:37 Initializing connector: telegram
panic: template: /etc/slurm/telegramTemplate.html:798: function "className" not defined
goroutine 1 [running]:
html/template.Must(...)
/opt/hostedtoolcache/go/1.17.13/x64/src/html/template/template.go:374
github.com/CLIP-HPC/goslmailer/internal/renderer.RenderTemplate({0xc0004e2200, 0xc0004eecf0}, {0xc0004d37cc, 0x4}, 0xc0004da700, {0x7fff8dc2adcf, 0xa}, 0xc0006bfcb0)
/home/runner/work/goslmailer/goslmailer/internal/renderer/renderer.go:40 +0x55a
github.com/CLIP-HPC/goslmailer/connectors/telegram.(*Connector).SendMessage(0xc0004e6000, 0xc0004d0f00, 0x1, 0x8)
/home/runner/work/goslmailer/goslmailer/connectors/telegram/telegram.go:81 +0x455
main.main()
/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:96 +0x557
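The panic above names the template file and line (798). Go's template parser reports `function "..." not defined` when it meets a `{{ ... }}` action whose first word is not a known function, so looking at that line is a reasonable first step. A small sketch for doing so follows; `show_template_line` is a hypothetical helper name, not a goslmailer command.

```shell
# show_template_line FILE N
# Prints line N of FILE plus one line of context on either side, prefixed with line numbers.
show_template_line() {
    awk -v n="$2" 'NR >= n - 1 && NR <= n + 1 { printf "%d: %s\n", NR, $0 }' "$1"
}
```

Here that would be `show_template_line /etc/slurm/telegramTemplate.html 798`. If the line contains something like `{{ className ... }}` that was never meant as a template action, the file is probably not the template it is supposed to be.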
Yeah, so I split them. There are now two config files in /etc/slurm/
# cat /etc/slurm/tgslurmbot.conf
{
  "logfile": "/tmp/tgslurmbot.log",
  "debugconfig": true,
  "binpaths": {
    "sacct":"/usr/bin/sacct",
    "sstat":"/usr/bin/sstat"
  },
  "defaultconnector": "telegram",
  "connectors": {
    "telegram": {
      "name": "testbot",
      "url": "",
      "token": "everythingisstillfubar",
      "renderToFile": "no",
      "spoolDir": "/tmp/telegramgobs",
      "messageTemplate": "/etc/slurm/telegramTemplate.html",
      "useLookup": "no",
      "format": "HTML"
    }
  },
  "qosmap": {
    "elevated": 3600,
    "normal": 28800
  }
}
# cat /etc/slurm/goslmailer.conf
{
  "logfile": "/tmp/goslmailer.log",
  "debugconfig": true,
  "binpaths": {
    "sacct":"/usr/bin/sacct",
    "sstat":"/usr/bin/sstat"
  },
  "defaultconnector": "msteams",
  "connectors": {
    "msteams": {
      "name": "dev channel",
      "renderToFile": "yes",
      "spoolDir": "/tmp",
      "url": "https://msteams/webhook/url",
      "adaptiveCardTemplate": "/path/template.json",
      "useLookup": "GECOS"
    },
    "mailto": {
      "name": "original slurm mail functionality, extended.",
      "mailCmd": "/usr/bin/mutt",
      "mailCmdParams": "-s \"Job {{ .SlurmEnvironment.SLURM_JOB_ID }} ({{ .SlurmEnvironment.SLURM_JOB_NAME }}) {{ .SlurmEnvironment.SLURM_JOB_MAIL_TYPE }}\"",
      "mailTemplate": "/etc/slurm/mailTemplate.tmpl",
      "mailFormat": "HTML",
      "allowList": ".+@(imp|imba.oeaw|gmi.oeaw).ac.at"
    },
    "telegram": {
      "name": "telegram bot",
      "url": "",
      "token": "everythingisstillfubar",
      "renderToFile": "no",
      "spoolDir": "/tmp/telegramgobs",
      "messageTemplate": "/etc/slurm/telegramTemplate.html",
      "useLookup": "no",
      "format": "HTML"
    },
    "discord": {
      "name": "DiscoSlurmBot",
      "triggerString": "showmeslurm",
      "token": "PasteBotTokenHere",
      "messageTemplate": "/path/to/template.md"
    },
    "mattermost": {
      "name": "MatTheSlurmBot",
      "serverUrl": "https://someSpaceName.cloud.mattermost.com",
      "wsUrl": "wss://someSpaceName.cloud.mattermost.com",
      "token": "PasteBotTokenHere",
      "triggerString": "showmeslurm",
      "messageTemplate" : "/path/to/mattermostTemplate.md"
    },
    "matrix": {
      "username": "@myuser:matrix.org",
      "token": "syt_dGRpZG9ib3QXXXXXXXEyQMBEmvOVp_10Jm93",
      "homeserver": "matrix.org",
      "template": "/path/to/matrix_template.md"
    },
    "slack": {
      "token": "PasteSlackBotTokenHere",
      "messageTemplate": "/path/to/template.md",
      "renderToFile": "spool",
      "spoolDir": "/tmp"
    },
    "textfile": {
      "path": "/tmp"
    }
  },
  "qosmap": {
    "elevated": 3600,
    "normal": 28800
  }
}
Same errors in slurmctld.log
[2023-02-26T17:36:21.483] _slurm_rpc_submit_batch_job: JobId=158 InitPrio=4294901757 usec=325
[2023-02-26T17:36:22.340] error: MailProg returned error, it's output was '2023/02/26 17:36:22 Initializing connector: discord
2023/02/26 17:36:22 Initializing connector: mailto
2023/02/26 17:36:22 Initializing connector: matrix
2023/02/26 17:36:22 Initializing connector: mattermost
2023/02/26 17:36:22 Initializing connector: msteams
2023/02/26 17:36:22 Initializing connector: slack
2023/02/26 17:36:22 Initializing connector: telegram
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4e8729]
goroutine 1 [running]:
log.(*Logger).Output(0x0, 0x20, {0xc00003c060, 0x51})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:165 +0x89
log.(*Logger).Fatalf(0xc000036180, {0xddcfb6, 0xdb25af}, {0xc0006bff50, 0xc000090570, 0x40ecc7})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:210 +0x4c
main.main()
/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:47 +0x1c5
'
[2023-02-26T17:36:23.806] error: MailProg returned error, it's output was '2023/02/26 17:36:23 Initializing connector: discord
2023/02/26 17:36:23 Initializing connector: mailto
2023/02/26 17:36:23 Initializing connector: matrix
2023/02/26 17:36:23 Initializing connector: mattermost
2023/02/26 17:36:23 Initializing connector: msteams
2023/02/26 17:36:23 Initializing connector: slack
2023/02/26 17:36:23 Initializing connector: telegram
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4e8729]
goroutine 1 [running]:
log.(*Logger).Output(0x0, 0x20, {0xc00003c060, 0x51})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:165 +0x89
log.(*Logger).Fatalf(0xc000036180, {0xddcfb6, 0xdb25af}, {0xc00063ff50, 0xc000090570, 0x40ecc7})
/opt/hostedtoolcache/go/1.17.13/x64/src/log/log.go:210 +0x4c
main.main()
/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go:47 +0x1c5
'
logfiles:
# cat /tmp/goslmailer.log
tgslurmbot:2023/02/26 08:48:58.327668 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 08:48:58.327705 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 08:48:58.327716 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 08:48:58.327723 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 08:48:58.327729 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 08:48:58.327736 tgslurmbot.go:58: Starting: "testbot"
tgslurmbot:2023/02/26 08:52:55.231591 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 08:52:55.231631 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 08:52:55.231641 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 08:52:55.231659 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 08:52:55.231663 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 08:52:55.231668 tgslurmbot.go:58: Starting: "testbot"
tgslurmbot:2023/02/26 08:53:15.423147 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 08:53:15.423182 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 08:53:15.423191 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 08:53:15.423195 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 08:53:15.423198 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 08:53:15.423202 tgslurmbot.go:58: Starting: "testbot"
tgslurmbot:2023/02/26 15:21:20.191616 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 15:21:20.193793 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 15:21:20.193957 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 15:21:20.193990 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 15:21:20.194019 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 15:21:20.194051 tgslurmbot.go:58: Starting: "testbot"
goslmailer:2023/02/26 17:35:50.057620 goslmailer.go:50: ======================== START OF RUN ==========================================
goslmailer:2023/02/26 17:35:50.057867 version.go:11: ----------------------------------------
goslmailer:2023/02/26 17:35:50.057877 version.go:12: Version: v2.7.1
goslmailer:2023/02/26 17:35:50.057881 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
goslmailer:2023/02/26 17:35:50.057884 version.go:14: ----------------------------------------
goslmailer:2023/02/26 17:35:50.057886 config.go:78: DUMP CONFIG:
goslmailer:2023/02/26 17:35:50.057983 config.go:79: CONFIGURATION: &config.ConfigContainer{DebugConfig:true, Logfile:"/tmp/goslmailer.log", Binpaths:map[string]string{"sacct":"/usr/bin/sacct", "sstat":"/usr/bin/sstat"}, DefaultConnector:"msteams", Connectors:map[string]map[string]string{"discord":map[string]string{"messageTemplate":"/path/to/template.md", "name":"DiscoSlurmBot", "token":"PasteBotTokenHere", "triggerString":"showmeslurm"}, "mailto":map[string]string{"allowList":".+@(imp|imba.oeaw|gmi.oeaw).ac.at", "mailCmd":"/usr/bin/mutt", "mailCmdParams":"-s \"Job {{ .SlurmEnvironment.SLURM_JOB_ID }} ({{ .SlurmEnvironment.SLURM_JOB_NAME }}) {{ .SlurmEnvironment.SLURM_JOB_MAIL_TYPE }}\"", "mailFormat":"HTML", "mailTemplate":"/etc/slurm/mailTemplate.tmpl", "name":"original slurm mail functionality, extended."}, "matrix":map[string]string{"homeserver":"matrix.org", "template":"/path/to/matrix_template.md", "token":"syt_dGRpZG9ib3QXXXXXXXEyQMBEmvOVp_10Jm93", "username":"@myuser:matrix.org"}, "mattermost":map[string]string{"messageTemplate":"/path/to/mattermostTemplate.md", "name":"MatTheSlurmBot", "serverUrl":"https://someSpaceName.cloud.mattermost.com", "token":"PasteBotTokenHere", "triggerString":"showmeslurm", "wsUrl":"wss://someSpaceName.cloud.mattermost.com"}, "msteams":map[string]string{"adaptiveCardTemplate":"/path/template.json", "name":"dev channel", "renderToFile":"yes", "spoolDir":"/tmp", "url":"https://msteams/webhook/url", "useLookup":"GECOS"}, "slack":map[string]string{"messageTemplate":"/path/to/template.md", "renderToFile":"spool", "spoolDir":"/tmp", "token":"PasteSlackBotTokenHere"}, "telegram":map[string]string{"format":"HTML", "messageTemplate":"/etc/slurm/telegramTemplate.html", "name":"telegram bot", "renderToFile":"no", "spoolDir":"/tmp/telegramgobs", "token":"5844013197:AAHQmCJFmLMD0y78g8dxGXhzxd5XwYLe4pw", "url":"", "useLookup":"no"}, "textfile":map[string]string{"path":"/tmp"}}, QosMap:map[string]uint64{"elevated":0xe10, "normal":0x7080}}
goslmailer:2023/02/26 17:35:50.058006 config.go:80: CONFIGURATION logfile: /tmp/goslmailer.log
goslmailer:2023/02/26 17:35:50.058014 config.go:81: --------------------------------------------------------------------------------
goslmailer:2023/02/26 17:35:50.058021 invocation_context.go:34: Parsing CMDLine:
goslmailer:2023/02/26 17:35:50.058025 invocation_context.go:35: CMD subject: "Default Blank Subject"
goslmailer:2023/02/26 17:35:50.058028 invocation_context.go:36: CMD others: []string{}
goslmailer:2023/02/26 17:35:50.058031 invocation_context.go:37: --------------------------------------------------------------------------------
goslmailer:2023/02/26 17:35:50.058034 invocation_context.go:41: DUMP RECEIVERS:
goslmailer:2023/02/26 17:35:50.058039 invocation_context.go:42: Receivers: main.Receivers(nil)
goslmailer:2023/02/26 17:35:50.058048 invocation_context.go:43: invocationContext: &main.invocationContext{CmdParams:main.CmdParams{Subject:"Default Blank Subject", Other:[]string{}}, Receivers:main.Receivers(nil)}
goslmailer:2023/02/26 17:35:50.058051 invocation_context.go:44: --------------------------------------------------------------------------------
goslmailer:2023/02/26 17:35:50.058056 getjobcontext.go:235: Start retrieving job stats
goslmailer:2023/02/26 17:35:50.058070 getjobcontext.go:236: slurmjob.SlurmEnvironment{SLURM_ARRAY_JOB_ID:"", SLURM_ARRAY_TASK_COUNT:"", SLURM_ARRAY_TASK_ID:"", SLURM_ARRAY_TASK_MAX:"", SLURM_ARRAY_TASK_MIN:"", SLURM_ARRAY_TASK_STEP:"", SLURM_CLUSTER_NAME:"", SLURM_JOB_ACCOUNT:"", SLURM_JOB_DERIVED_EC:"", SLURM_JOB_EXIT_CODE:"", SLURM_JOB_EXIT_CODE2:"", SLURM_JOB_EXIT_CODE_MAX:"", SLURM_JOB_EXIT_CODE_MIN:"", SLURM_JOB_GID:"", SLURM_JOB_GROUP:"", SLURM_JOBID:"", SLURM_JOB_ID:"", SLURM_JOB_MAIL_TYPE:"", SLURM_JOB_NAME:"", SLURM_JOB_NODELIST:"", SLURM_JOB_PARTITION:"", SLURM_JOB_QUEUED_TIME:"", SLURM_JOB_RUN_TIME:"", SLURM_JOB_STATE:"", SLURM_JOB_STDIN:"", SLURM_JOB_UID:"", SLURM_JOB_USER:"", SLURM_JOB_WORK_DIR:""}
goslmailer:2023/02/26 17:35:50.058886 goslmailer.go:71: Unable to retrieve job stats. Error: Invalid subject line: Default Blank Subject
# cat /tmp/tgslurmbot.log
tgslurmbot:2023/02/26 17:25:02.858693 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 17:25:02.859473 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 17:25:02.860025 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 17:25:02.860029 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 17:25:02.860033 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 17:25:02.860038 tgslurmbot.go:58: Starting: "testbot"
tgslurmbot:2023/02/26 17:32:18.119490 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 17:32:18.119540 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 17:32:18.119554 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 17:32:18.119559 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 17:32:18.119564 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 17:32:18.119570 tgslurmbot.go:58: Starting: "testbot"
tgslurmbot:2023/02/26 17:33:36.081803 tgslurmbot.go:50: ======================= tgslurmbot start =======================================
tgslurmbot:2023/02/26 17:33:36.081842 version.go:11: ----------------------------------------
tgslurmbot:2023/02/26 17:33:36.081852 version.go:12: Version: v2.7.1
tgslurmbot:2023/02/26 17:33:36.081856 version.go:13: Build commit hash: d60a3ea6a0d1051bbcf6e2526d77a15904aa6581
tgslurmbot:2023/02/26 17:33:36.081864 version.go:14: ----------------------------------------
tgslurmbot:2023/02/26 17:33:36.081869 tgslurmbot.go:58: Starting: "testbot"
By the way, dunno if this is relevant, but the '/home/runner/work/goslmailer/goslmailer/cmd/goslmailer/goslmailer.go' path referenced in the logs does not really exist.
Ok, config files now look better.
goslmailer:2023/02/26 17:35:50.058014 config.go:81: --------------------------------------------------------------------------------
goslmailer:2023/02/26 17:35:50.058021 invocation_context.go:34: Parsing CMDLine:
goslmailer:2023/02/26 17:35:50.058025 invocation_context.go:35: CMD subject: "Default Blank Subject"
goslmailer:2023/02/26 17:35:50.058028 invocation_context.go:36: CMD others: []string{}
goslmailer:2023/02/26 17:35:50.058031 invocation_context.go:37: --------------------------------------------------------------------------------
This tells me you've just invoked goslmailer manually, with no switches on the command line.
Try submitting:
sbatch --mail-type=ALL --mail-user='telegram:YOURID' --wrap='sleep 60'
And show me the log.
When slurm invokes goslmailer, it looks something like this:
goslmailer:2023/02/26 13:01:19.004375 config.go:81: --------------------------------------------------------------------------------
goslmailer:2023/02/26 13:01:19.004498 invocation_context.go:34: Parsing CMDLine:
goslmailer:2023/02/26 13:01:19.004512 invocation_context.go:35: CMD subject: "Slurm Job_id=37 Name=wrap Ended, Run time 00:01:00, COMPLETED, ExitCode 0"
goslmailer:2023/02/26 13:01:19.004518 invocation_context.go:36: CMD others: []string{"telegram:XXX"}
goslmailer:2023/02/26 13:01:19.004523 invocation_context.go:37: --------------------------------------------------------------------------------
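The difference between the two excerpts above is the `CMD subject:` line, so it can be pulled out of the log to tell a manual run from one triggered by slurm. The sketch below assumes the log contains `CMD subject:` lines in the format shown in this thread (they appear here with debugconfig enabled); `last_subject` and `invoked_by` are made-up helper names.

```shell
# Print the subject of the most recent goslmailer invocation recorded in LOGFILE.
last_subject() {
    sed -n 's/.*CMD subject: //p' "$1" | tail -n 1
}

# Classify the last run: manual (default blank subject) or invoked with a real subject.
invoked_by() {
    subject=$(last_subject "$1")
    case "$subject" in
        '') echo "no invocation logged" ;;
        '"Default Blank Subject"') echo "manual run (no subject given)" ;;
        *) echo "invoked with subject: $subject" ;;
    esac
}
```

Running `invoked_by /tmp/goslmailer.log` after the sbatch test should then show the job's subject line rather than the blank default.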
Hey, glad to hear you've made it!
Just one question before I close this. Which template file did you use in the first place to get this error message? telegram.html from the release zip? Was it modified before deployment? Later I'll do a release that addresses the logger panic and replaces it with a more descriptive log message.
Yes. It was from the html file in the release zip, and no, I did not change anything.
As part of the discussion in #33, I tried out the telegram HTML template from the release zip and it worked OK. I will close this now since the rest of the issue was solved.
Hi,
Using version 2.7.1 from releases.
Trying to test tgslurmbot in this fake docker cluster before putting it in my real one, but I am encountering issues.
Mailprog in slurm.conf is set:
Here is my config for tgslurmbot.conf/goslmailer.conf (same file, symlinked)
Here is the bot in telegram, seemingly working when I sent /start to it after running
I submitted this test code to slurm with sbatch:
Code ran correctly, as evidenced by this output
But no response from the bot. Nada.
Slurm version is: