jspc closed this 7 months ago
Hang on, something doesn't look right. Closing; will reopen
I hadn't realised that the `task.Clone()` function used hardcoded fields and so I wasn't seeing what I expected. Updated, reopened.
Great work! Couple of points:

- How should `workingdir` interact with `files`? One option is to use it in favor of the default `/tork` workdir.
- `workdir` instead of `workingdir`, so it's a little more concise? `input.Task` as well.

Yeah no worries, I'll tackle this afternoon.
With regards to the points:

- I wasn't sure what to do with `/tork` whether or not a working directory was set, especially since `/tork` has the stdout file (which I neither fully understand enough to change, nor want to store in a directory I'm loading to/from disk). I want to take direction on this, since you know this project better than absolutely anyone, but I suspect this requires some thought.
- I'm also not sure about the `workdir` variable and the `WorkingDir` value set in the `container.Config` struct; I also think they might solve different problems, but I'm not sure I understand what `Tasks.Files` does or where it's used.

I'd like to keep `stdout` internal and out of the way of the user. For additional context, `files` are just a convenient means to create arbitrary files in the `/tork` workdir, that you can then refer to from the `run` script:

```yaml
name: Get the post
image: python:3
files:
  script.py: |
    import requests
    url = "https://jsonplaceholder.typicode.com/posts/1"
    response = requests.get(url)
    data = response.json()
    print(data['title'])
run: |
  pip install requests
  python script.py > $TORK_OUTPUT
```
Possibly, what we're looking at here is having two directories: one internal for Tork itself to have the `stdout` and `entrypoint` files and another "userspace" working directory.
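Purely as an illustration of that split (the names and exact paths here are placeholders, not anything decided in this thread), it might look like:

```go
// Hypothetical sketch of the two-directory idea: Tork-internal files in one
// place, user-facing files in another. Names and paths are illustrative only.
package sketch

const (
	// internalDir would hold the files Tork manages itself (stdout, entrypoint).
	internalDir = "/tork"

	// userWorkdir would be the default "userspace" working directory that
	// files: entries are written into and that the run script executes from.
	userWorkdir = "/tork/workdir"
)
```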
As for the naming, I'd lean towards `workdir` in the `Task` API. Even Docker itself is a little confusing since the Dockerfile directive is `WORKDIR` but their API is `WorkingDir`. In all honesty, the difference is minor, but I figure that since `workdir` describes what it means unambiguously there's no need to spell it out further.

Perfect! I'll respond to the second point first, since it's simplest: great point, works for me! Workdir it is. (To solidify the lack of consistency across everything even more, `os/exec.Cmd` just calls it `Dir` 🙄)
Otherwise, how does this design sound?

- The default `workdir` for the docker runtime is something like `/userspace`, or `/work`, or something (we can't set it to whatever the default working dir of a container is, or we risk blatting away scripts and files already in there, I think).
- Then we have mostly the same flow as you had before: if files are set, add them to whatever `workdir` is, based on the above, and always set the container working dir. Otherwise, only set the container working dir if `workdir != DefaultWorkdir`.
Yea, that sounds great. We just need to make sure that `files` are written to the workdir, since the user will be using relative paths in their `run` script. For the workdir name we could probably use `/tork/workdir` so everything is kept nicely in one folder hierarchy.
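For illustration, the flow being agreed here might look roughly like this in Go (the names `DefaultWorkdir` and `containerWorkingDir` are hypothetical, not the project's actual API):

```go
// Rough sketch of the flow described above; not the actual Tork implementation.
package sketch

// DefaultWorkdir is a hypothetical constant for the agreed default location.
const DefaultWorkdir = "/tork/workdir"

// containerWorkingDir decides which directory the task's files should be
// written to, and whether the container's working dir needs to be set
// explicitly in its config.
func containerWorkingDir(taskWorkdir string, files map[string]string) (workdir string, setInConfig bool) {
	workdir = taskWorkdir
	if workdir == "" {
		workdir = DefaultWorkdir
	}
	// If files are present they are always written into workdir, so the
	// container's working dir is always set; otherwise it only needs setting
	// when the task asked for something other than the default.
	setInConfig = len(files) > 0 || workdir != DefaultWorkdir
	return workdir, setInConfig
}
```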
Nice one, commits incoming
I'm not hugely happy with my solution here, but if you think I'm just being unduly harsh on myself then merge away!
Otherwise I'll tackle it tomorrow morning; I'm off down the pub for the football instead.
Essentially: shifting `Files` into a subdirectory (either `/tork/workspace` or user specified) meant revisiting the tar writer. Of course, if the workdir is still under `/tork` then great: you only need to prepend that path to filenames in the tar. If not, you need two tars.

The combinatorics started getting confusing, so I wrote it to always produce two tars. It's slightly less efficient, but not hugely.

If that's an issue then I'll come back to this.
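For anyone following along, a rough idea of what "always produce two tars" could look like (a simplified sketch with made-up names, not the code in this PR):

```go
// Simplified sketch of building one archive per destination directory; the
// real PR code differs, this just illustrates the "two tars" shape.
package sketch

import (
	"archive/tar"
	"bytes"
	"path"
)

// buildArchive writes each name/content pair into a tar, prefixing every entry
// name with dir — the "prepend that path to filenames in the tar" idea above.
func buildArchive(dir string, files map[string]string) (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	tw := tar.NewWriter(buf)
	for name, contents := range files {
		hdr := &tar.Header{
			Name: path.Join(dir, name),
			Mode: 0o644,
			Size: int64(len(contents)),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(contents)); err != nil {
			return nil, err
		}
	}
	return buf, tw.Close()
}

// buildBoth produces the two archives: one for Tork's internal files under
// /tork, one for the user's files under whatever workdir the task asked for.
func buildBoth(workdir string, internal, userFiles map[string]string) (*bytes.Buffer, *bytes.Buffer, error) {
	internalTar, err := buildArchive("/tork", internal)
	if err != nil {
		return nil, nil, err
	}
	userTar, err := buildArchive(workdir, userFiles)
	if err != nil {
		return nil, nil, err
	}
	return internalTar, userTar, nil
}
```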
This looks right on first glance. Will take a better look on the weekend. I'm moving tomorrow, so will definitely try and look after I'm settled in a bit.
Cool, no worries - there's absolutely no rush on any of this, It Works On My Fork ™️ and that's more than good enough for me to continue doing what I'm doing!
Actually, it's a good job we didn't merge. Just tried this on another machine and got a weird error where the entrypoint file didn't exist.

Turns out that I was trying to be a bit clever by passing an `*os.File` about, rather than closing and re-opening like the original implementation did. But actually, in certain environments/OSes at least, the close and re-open flushes waiting writes to disk that might not have been flushed yet, and reading from a file doesn't take that cache into account on all platforms (think of it like `READ UNCOMMITTED` in the SQL spec, and how this is ignored in Postgres).

So for now I've done the close/reopen.

I did draft a change to not bother about disks at all and pass a `bytes.Buffer` around, but I have no idea how much memory tork instances might have (it might be a tiny 256MB instance talking to a remote docker host for all I know) or how large an entrypoint/`Files` could be, so writing to disk and letting Go handle seeking seemed smarter than sticking it all in RAM.

(Plus I've found that commenting out the line to delete the archives after creation was really helpful for seeing what had been written to disk when debugging archive writing, and I didn't want to lose that option again.)
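A tiny illustration of the close/reopen pattern described above (hypothetical helper, not the Tork code; the exact platform behaviour around write caching may vary):

```go
// Minimal sketch of write -> close -> reopen, so a reader gets a fresh handle
// to a fully written file instead of reusing the original open *os.File.
package sketch

import "os"

func writeThenReopen(path string, contents []byte) (*os.File, error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if _, err := f.Write(contents); err != nil {
		f.Close()
		return nil, err
	}
	// Close before handing the file to a reader; keeping and passing around the
	// original *os.File is what led to the missing-entrypoint error above.
	if err := f.Close(); err != nil {
		return nil, err
	}
	return os.Open(path) // fresh read-only handle
}
```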
Superseded by https://github.com/runabol/tork/pull/368
Allow for the setting of workingdirs at the task level
This closes https://github.com/runabol/tork/issues/361, and allows tasks to set the correct workingdir for a container (it is currently ignored for shell runtimes). This helps avoid things like `cd /foo` in commands; it also has the side effect of creating directories under certain circumstances.

Additionally, this PR contains an updated swagger.json to include the new task field; this command doesn't seem to have been run for a couple of changes and so it has pulled in some other changes. Happy to remove that change and open a new PR if needed.
Edit to add: I haven't updated `input.Task` to include a working dir because I want to wait and see whether I should make that available in yaml; I also don't know where the docs live so that I can document public interfaces like the yaml file.