Closed bogdang989 closed 7 years ago
I've been testing bcbio CWL runs with the latest bunny 1.0.0-rc2 and see similar problems. Anything that is meant for a scatter, including the initial input JSON, gets passed as a single array of all objects. So if you have an input like:
{
  "description": [
    "Test1",
    "Test2"
  ]
}
instead of running two scattered jobs with description: Test1 and description: Test2 as inputs, you get one job with description: [Test1, Test2].
Thanks for all the work on getting bunny up to date with CWL 1.0, excited to have this running with bcbio generated CWL.
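To make the expected behaviour concrete, here is a minimal Python sketch of the scatter semantics described above. It is not bunny's or bcbio's code; `scatter_jobs` is a hypothetical helper, and it assumes a simple element-wise (dotproduct-style) scatter where all scattered ports have the same length.

```python
def scatter_jobs(inputs, scatter):
    """Split `inputs` into one job per element of the scattered ports.

    Hypothetical helper for illustration only: assumes every port named in
    `scatter` holds a list, and that all those lists have the same length.
    """
    if not scatter:
        raise ValueError("at least one port must be scattered")
    lengths = {len(inputs[port]) for port in scatter}
    if len(lengths) != 1:
        raise ValueError("scattered ports must have equal lengths")
    count = lengths.pop()

    jobs = []
    for index in range(count):
        job = dict(inputs)  # non-scattered ports are passed through unchanged
        for port in scatter:
            job[port] = inputs[port][index]  # one element per job
        jobs.append(job)
    return jobs


jobs = scatter_jobs({"description": ["Test1", "Test2"]}, ["description"])
print(jobs)
```

With the input above this yields two jobs, one per description, rather than a single job that receives the whole array.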
@chapmanb It's probably an unrelated issue. @stefanristeski did some work testing bcbio on bunny and identified some bunny issues, but also instances where the bcbio script doesn't produce valid CWL v1. For scatter specifically, the issue was that bcbio declares scatter as step_id/input_id, but for v1 it should only be input_id.
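As a rough illustration of that difference, the following Python sketch rewrites a step-qualified scatter reference into the bare input id. `normalize_scatter_id` is a hypothetical helper, not part of bcbio or bunny, and the step name is only borrowed from this thread as an example.

```python
def normalize_scatter_id(scatter_id, step_id):
    """Strip a leading 'step_id/' from a scatter reference, if present.

    Hypothetical helper: assumes the older form is exactly
    '<step_id>/<input_id>' and the v1 form is just '<input_id>'.
    """
    prefix = step_id + "/"
    if scatter_id.startswith(prefix):
        return scatter_id[len(prefix):]
    return scatter_id  # already a bare input id


old_style = "prep_align_inputs/config__algorithm__align_split_size"
print(normalize_scatter_id(old_style, "prep_align_inputs"))
```

An id that is already in the bare form is returned unchanged, so the helper can be applied to either style.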
@bogdang989 I've fixed it. This is the patch: 069c1e8233ab16c2ac6cb76c2970afe4d05b1a75
Can you please verify that it's working? Thanks.
@simonovic86 It works! Thanks a lot
Luka -- thanks much for the tip, I hadn't realized the specification for this changed. It's so useful to have a separate implementation to help shake out these issues.
If I swap that over it does try to scatter but I immediately get an error:
org.rabix.engine.processor.handler.EventHandlerException: Port config__algorithm__align_split_size for root.alignment.1.prep_align_inputs and rootId 3f628f13-951b-41ed-8f06-0afc11ef1314 is not a list and therefore cannot be scattered.
The input is a list:
"config__algorithm__align_split_size": [
  25000,
  25000
],
and I think is specified right:
inputs:
- id: config__algorithm__align_split_size
  type:
    items: long
    type: array
[...]
scatter:
- config__algorithm__align_split_size
Am I doing something obviously wrong here as well? I can also put together a repo with all of the test data to reproduce.
Thanks again for the help with this.
Can you attach the entire workflow so we can examine it? Did you try specifying the type as just long? As specified in your comment, I would expect the input to be a list of lists of longs, so that when scattered, each invocation gets a list of longs.
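The type reasoning here can be sketched in Python: scattering removes one array level from the port's type, so the type declared on the scattered input decides what each invocation receives. This is only a model of the argument in this comment, not bunny's actual type checker, and `scattered_item_type` is a hypothetical name.

```python
def scattered_item_type(port_type):
    """Return the type each scattered invocation receives.

    Hypothetical helper: port types are modelled as CWL-style dicts,
    e.g. {"type": "array", "items": "long"}.
    """
    if not (isinstance(port_type, dict) and port_type.get("type") == "array"):
        raise TypeError("port is not a list and therefore cannot be scattered")
    return port_type["items"]  # one array level is removed by the scatter


array_of_long = {"type": "array", "items": "long"}
array_of_arrays = {"type": "array", "items": array_of_long}

print(scattered_item_type(array_of_long))    # each job gets a single long
print(scattered_item_type(array_of_arrays))  # each job gets a list of longs
```

Under this model, an input value like [25000, 25000] scatters into individual longs, so a per-invocation type of array of long would need a list of lists as the source.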
@chapmanb, @stefanristeski I've opened another GitHub issue, #94, to track bcbio support.
If a node is scattered and only one job is created, the output is a single object instead of an array of objects.
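A small Python sketch of the expected gathering behaviour: the outputs of a scattered node are collected into a list per output port, and that stays a list even when the scatter produced only one job. `gather_outputs` and the file names are hypothetical and only illustrate the expectation, not bunny's implementation.

```python
def gather_outputs(job_outputs):
    """Collect per-job outputs of a scattered node into one list per port.

    Hypothetical helper: `job_outputs` is a list with one dict per job,
    mapping output port ids to values.
    """
    gathered = {}
    for outputs in job_outputs:
        for port, value in outputs.items():
            gathered.setdefault(port, []).append(value)
    return gathered


single = gather_outputs([{"out": "a.bam"}])
double = gather_outputs([{"out": "a.bam"}, {"out": "b.bam"}])
print(single)  # the one-job case still yields a list with one element
print(double)
```

The point is that the one-job case is not special-cased: downstream steps can rely on always receiving an array.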