Open hermannschwaerzlerUIBK opened 2 years ago
This is my code:
Sorry for the hassle! I was able to fix the problem. In hindsight it was an obvious error: I had not added a
    _ "github.com/CLIP-HPC/goslmailer/connectors/summary"
line in the import section of cmd/gobler/gobler.go :-(
After adding that everything works as expected!
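For readers wondering why a missing blank import can cause this kind of failure: a blank import exists only to run a package's init() function, which is a common way for plugins to register themselves. The sketch below illustrates that pattern in a single file. The Connector interface, the registry map and the Register function are hypothetical names made up for this illustration and are not taken from the goslmailer source.

```go
package main

import (
	"fmt"
	"sort"
)

// Connector is a hypothetical stand-in for a connector interface.
type Connector interface {
	Name() string
}

// registry maps connector names to implementations (hypothetical).
var registry = map[string]Connector{}

// Register adds a connector to the registry under its own name.
func Register(c Connector) { registry[c.Name()] = c }

// summary is a dummy connector used only for this illustration.
type summary struct{}

func (summary) Name() string { return "summary" }

// In a multi-package project this init would live in the connector's own
// package. It only runs if that package is imported somewhere, which is
// what a blank import (import _ "...") is for.
func init() { Register(summary{}) }

// registered returns the names of all registered connectors, sorted.
func registered() []string {
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(registered())
}
```

If the package holding the init() is never imported, the registry simply stays empty and the connector cannot be found at runtime.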
Hey Hermann, sorry for the late answer, but somehow you've managed to hit the one week when the whole team's vacations overlapped :disappointed: We're back now, and if you have anything to ask (no hassle at all), we'll answer much faster. Glad that you've managed to figure it out. The idea for the connector also sounds interesting, and it would be great if you PR it when it's finished. Until then, if we can assist you with development in any way, let us know.
How do you plan to get the workdir? Passed in manually via --mail-user=summary:/path/, or through the Slurm job env? We have the env for newer Slurm versions, but we could extend it for older ones as well if you need it:
Dear @pja237,
thanks for getting in touch. In order to get the work-dir I do this in the configuration:
    "connectors": {
      "summary": {
        "path": "{{ .SlurmEnvironment.SLURM_JOB_WORK_DIR }}/slurm-{{ .SlurmEnvironment.SLURM_JOB_ID }}.summary",
        [...]
We are using Slurm version 22.05.x and with that this works perfectly.
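As a rough illustration of how such a path template expands, here is a minimal sketch using Go's text/template. The slurmEnv and jobContext structs are hypothetical stand-ins that only mirror the two field names used in the template above; the real job context in goslmailer may look different.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// slurmEnv and jobContext are hypothetical stand-ins that mirror only the
// field names referenced by the path template in the configuration.
type slurmEnv struct {
	SLURM_JOB_WORK_DIR string
	SLURM_JOB_ID       string
}

type jobContext struct {
	SlurmEnvironment slurmEnv
}

// renderPath expands a path template against the given job context.
func renderPath(tmpl string, jc jobContext) (string, error) {
	t, err := template.New("path").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, jc); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	const tmpl = "{{ .SlurmEnvironment.SLURM_JOB_WORK_DIR }}/slurm-{{ .SlurmEnvironment.SLURM_JOB_ID }}.summary"
	jc := jobContext{SlurmEnvironment: slurmEnv{
		SLURM_JOB_WORK_DIR: "/home/alice/run1", // made-up example value
		SLURM_JOB_ID:       "4242",             // made-up example value
	}}
	p, err := renderPath(tmpl, jc)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(p)
}
```

Returning the parse error instead of using template.Must avoids a panic when the configured template is malformed.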
I am using spooling in goslmailer and then do the actual work in gobler, because goslmailer runs as user slurm, which is not privileged to write in everyone's working directories. :-)
But the gobler service can run as root and thus write those files.
I happen to have a few questions:
In a setup like mine, in goslmailer.conf for my connector I need only these lines, right?
    "summary": {
      "spoolDir": "/var/spool/goslmailer"
    }
As the only thing I am doing in goslmailer is to spool the "task".
And all the other configuration settings go to gobler.conf as they are needed only there, right?
To illustrate how I am doing things, here is my current state of SendMessage():
func (c *Connector) SendMessage(mp *message.MessagePack, useSpool bool, l *log.Logger) error {
	var (
		e    error = nil
		path       = bytes.Buffer{}
		body       = bytes.Buffer{}
	)
	if useSpool {
		// goslmailer side: only deposit the message in the spool directory
		err := spool.DepositToSpool(c.spoolDir, mp)
		if err != nil {
			l.Printf("DepositToSpool Failed!\n")
			return err
		}
	} else {
		// gobler side: render destination path
		p := template.Must(template.New("path").Parse(c.path))
		e = p.Execute(&path, mp.JobContext)
		if e != nil {
			return e
		}
		// render body
		err := renderer.RenderTemplate(c.template, "text", mp.JobContext, mp.TargetUser, &body)
		if err != nil {
			return err
		}
		// save body to file
		err = os.WriteFile(path.String(), body.Bytes(), 0644)
		if err != nil {
			return err
		}
		// chown that file to uid and gid of the destination directory
		splitPath := strings.Split(path.String(), "/")
		workDir := strings.Join(splitPath[0:(len(splitPath)-1)], "/")
		fileInfo, err := os.Stat(workDir)
		if err != nil {
			// without this check a failed Stat leaves fileInfo nil and the next line panics
			return err
		}
		stat := fileInfo.Sys().(*syscall.Stat_t)
		UID := int(stat.Uid)
		GID := int(stat.Gid)
		err = os.Chown(path.String(), UID, GID)
		if err != nil {
			return err
		}
	}
	return e
}
There might be an easier solution for the last part (chowning the file), as it is (or at least looks) a bit tedious.
Regards, Hermann
Evening Hermann,
here are my thoughts... :)
    "connectors": {
      "summary": {
        "path": "{{ .SlurmEnvironment.SLURM_JOB_WORK_DIR }}/slurm-{{ .SlurmEnvironment.SLURM_JOB_ID }}.summary",
        [...]
This is great. I hadn't thought of using a template like this outside of the mail connector's command line :)
I am using spooling in goslmailer and then do the actual work in gobler, because goslmailer runs as user slurm, which is not privileged to write in everyone's working directories. :-) But the gobler service can run as root and thus write those files.
Great (ab)use. :+1:
I happen to have a few questions: In a setup like mine, in goslmailer.conf for my connector I need only these lines, right?
    "summary": { "spoolDir": "/var/spool/goslmailer" }
Yes, correct.
As the only thing I am doing in goslmailer is to spool the "task". And all the other configuration settings go to gobler.conf as they are needed only there, right?
Exactly :+1:
func (c *Connector) SendMessage(mp *message.MessagePack, useSpool bool, l *log.Logger) error {
	[snip]
	// chown that file to uid and gid of the destination directory
	splitPath := strings.Split(path.String(), "/")
	workDir := strings.Join(splitPath[0:(len(splitPath) - 1)], "/")
	fileInfo, err := os.Stat(workDir)
	stat := fileInfo.Sys().(*syscall.Stat_t)
	UID := int(stat.Uid)
	GID := int(stat.Gid)
	os.Chown(path.String(), UID, GID)
Just a thought on this bit: assuming you want to give the summary to the user submitting the job, wouldn't it be safe to assume that he is also the owner of, or at least has some permissions on, the working directory (the one from .SlurmEnvironment.SLURM_JOB_WORK_DIR)?
Then perhaps you can just do the https://pkg.go.dev/os#Chown directly to him, without all the tedious bits of testing?
e.g.
	// this is just a sketch from memory and still needs proper error handling
	uid, _ := strconv.Atoi(mp.JobContext.SlurmEnvironment.SLURM_JOB_UID)
	gid, _ := strconv.Atoi(mp.JobContext.SlurmEnvironment.SLURM_JOB_GID)
	os.Chown(path.String(), uid, gid)
On a side note, I remember you were also interested in the matrix connector. Did you perhaps try it out yet?
best, Petar
Hi Petar,
sorry for the long delay. Now it was me who was out of office for about a week. :-)
Thank you for your helpful comments and support.
Regarding the chown: Yes, I guess it should be safe to simplify that part to
	UID, e = strconv.Atoi(mp.JobContext.SlurmEnvironment.SLURM_JOB_UID)
	GID, e = strconv.Atoi(mp.JobContext.SlurmEnvironment.SLURM_JOB_GID)
	os.Chown(path.String(), UID, GID)
(after having declared UID and GID as int further up). I tested it in my environment and it works.
I will prepare a pull request soonish.
I am planning to add a README.md file to the corresponding subdirectory of connectors to describe the necessary prerequisites for it (a spooling directory, the necessity of running gobler as root and potentially a few lines in job_submit.lua to make it work automagically).
Regarding the matrix connector: no, unfortunately I haven't been able to test it yet. We are in the middle of getting our new cluster to production and I had to focus my priorities...
Regards Hermann
Hey Hermann, no worries, whenever you're ready with the PR, fire away. Feel free to add the job_submit lua as a usage example, that would also be great :) Good luck with the new cluster roll out.
best, Petar
Hey Hermann, just checking up on this issue. I hope all is working well with the new cluster. Did you manage to put this code to good use in the end?
I am trying to implement a connector that "abuses" goslmailer to write a summary into a file in the work-dir of a job.
I wrote it such that it uses spooling. The first part (writing some .gob file to the spooling directory) works just fine. But running "gobler -c ..." gives me this output:
Any hints or ideas on how to debug this?