hail-is / hail

Cloud-native genomic dataframes and batch computing
https://hail.is
MIT License
984 stars 246 forks source link

[batch] Omit empty command strings when constructing shell scripts #14700

Closed jmarshall closed 1 month ago

jmarshall commented 1 month ago

We recently encountered jobs that failed due to syntax errors in the shell script generated by Hail, stemming from code such as

job.command('touch before')
job.command('\n'.join(f'echo {shlex.quote(msg)}' for msg in messages))
job.command('touch after')

Occasionally messages is an empty list, so this evaluates to job.command('') and the eventual shell script submitted by Hail contains

…
{
touch before
}
{

}
{
touch after
}
…

Shell compound commands like { … } must contain at least one command, so this is a syntax error.

Empty commands could be rewritten to generate e.g. : such as

…
{
:
}
…

but it seems easier and probably less surprising to just omit them.

jmarshall commented 1 month ago

(We've added if messages: … to the code in question to prevent this, but it would be good to fix the issue from the Hail end as well.)

cjllanwarne commented 1 month ago

@jmarshall another way to do this might be to ignore/do nothing during the initial call of job.command('') when the argument is empty. Assuming we never want to do anything for empty/None commands we might as well eliminate them at source. That way you should only have to handle the situation in one place, and not have to handle the invalid data structure in multiple downstream places.

I do like the idea of fixing this in the hail library, but you could also consider altering your upstream code so that it never makes an invalid "job.command('')" call in the first place. Eg by doing something like

job.command('touch before')
for msg in messages:
  job.command(f'echo {shlex.quote(msg)})
job.command('touch after')

(assuming it doesn't matter to you whether the messages are handled in the same section of the resulting command or not)

What do you think?

jmarshall commented 1 month ago

As I said, we have already worked around the particular instance of this. But I've raised it here so that we or other Hail users won't need to do so in future.

another way to do this might be to ignore/do nothing during the initial call of job.command('') when the argument is empty.

:shrug: I'm not sure that the presence of empty items in BashJob._command is intrinsically invalid, so whether to ignore such calls (by having BashJob.command() do nothing when its argument contains no non-whitespace characters, and verifying that calling command() is the only way to get entries into this list) or to prevent the actual invalid syntax that results later (either by omitting or emitting : or similar) is up to the maintainers.

cjllanwarne commented 1 month ago

@jmarshall it sounded when we caught up in person like you agreed that skipping empty command strings at source (and emitting a warning) during the original job.command call would be the way forward here... were you planning to update this PR with that change?

jmarshall commented 1 month ago

I've updated the PR accordingly — feel free to wordsmith the warning message.

ehigham commented 1 month ago

@jmarshall, thanks doing this - would you mind adding a simple unit test to lock down the behaviour?

jmarshall commented 1 month ago

I am unfamiliar with Hail's test infrastructure so it would be more time efficient for the maintainers to add a test themselves.

ehigham commented 1 month ago

@jmarshall - PR https://github.com/populationgenomics/hail/pull/349 for your consideration

jmarshall commented 1 month ago

Thanks, added with one tweak. Sadly I don't know how to convince your code analyser that using randint to make test cases in test code is not a security issue…

Feel free to push to PR branches directly, or just to add things while merging. You folks are the Hail maintainers after all!

ehigham commented 1 month ago

Thanks, added with one tweak. Sadly I don't know how to convince your code analyser that using randint to make test cases in test code is not a security issue…

Me neither :shrug:

Feel free to push to PR branches directly, or just to add things while merging.

I don't have write access to the populationgenomics fork, hence the PR :)

jmarshall commented 1 month ago

TIL “Allow edits from maintainers” applies only to PRs from personal forks, not organisation forks. Annoying!

cjllanwarne commented 1 month ago

I've authorized sha 600826710afb8dfef5fcbf19f440a43f263ec0ee in CI so we should get a test run through soon.