Open dadrian opened 6 years ago
I can confirm this is still happening with Apache Beam 2.9.0. My Dataflow job works fine if pushed and run manually with Java/Maven, but if I create a template from the job and run it, I see a similar exception.
I experimented with some JSON configuration options (Jackson @JsonFormat annotations) on the option getter, to no avail: none of the following variants worked. I'm using jackson-databind 2.9.5.
@JsonFormat(shape = JsonFormat.Shape.ARRAY)
ValueProvider<List<String>> getRepeated();
void setRepeated(ValueProvider<List<String>> value);

@JsonFormat(with = JsonFormat.Feature.ACCEPT_SINGLE_VALUE_AS_ARRAY)
ValueProvider<List<String>> getRepeated();
void setRepeated(ValueProvider<List<String>> value);

@JsonFormat(shape = JsonFormat.Shape.ARRAY, with = JsonFormat.Feature.ACCEPT_SINGLE_VALUE_AS_ARRAY)
ValueProvider<List<String>> getRepeated();
void setRepeated(ValueProvider<List<String>> value);
Seeing this with Beam 2.12 as well.
Is this being tracked as a Jira item?
It is not possible to launch a templated Dataflow job that accepts an option of type ValueProvider<List<String>>, despite support for lists within the ValueProvider class. I'm not sure if this is an SDK or a platform issue.

ValueProvider supports List<String> for accepting repeated options, or options in the form of a list. This accepts command-line arguments in the form --repeated=a,b,c and in the form --repeated=a --repeated=b --repeated=c. Both yield a list of the form ["a", "b", "c"].

However, if I try to launch a template that uses a ValueProvider<List<String>> using the gcloud tool, the value for --repeated is always encoded as a single string, and I always get a runtime JSON deserialization exception (got string, expected array). If I instead hit the API manually using the Python API client and explicitly pass a JSON array, the API kicks back "Invalid value at 'launch_parameters.parameters[0].value' (Map), Cannot have repeated items ('repeated') within a map."
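To make the expected behavior concrete, here is a minimal plain-Java sketch of the outcome described above: both command-line forms end up as the same list. It is an illustration only and does not use or reproduce Beam's actual option parser; the class name RepeatedOptionSketch and the method parse are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: mimics the described outcome; this is NOT Beam's parser. */
final class RepeatedOptionSketch {
    // Hypothetical option prefix, taken from the --repeated examples in this issue.
    private static final String PREFIX = "--repeated=";

    /** Collects every --repeated=... argument, splitting comma-separated values. */
    static List<String> parse(String... args) {
        List<String> values = new ArrayList<>();
        for (String arg : args) {
            if (!arg.startsWith(PREFIX)) {
                continue; // ignore unrelated arguments
            }
            // A single argument may carry several comma-separated items.
            for (String item : arg.substring(PREFIX.length()).split(",")) {
                if (!item.isEmpty()) {
                    values.add(item);
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Comma-separated form.
        System.out.println(parse("--repeated=a,b,c"));
        // Repeated-flag form.
        System.out.println(parse("--repeated=a", "--repeated=b", "--repeated=c"));
    }
}
```

The template launch path differs from this in that the parameter value arrives as one string rather than an array, which matches the "got string, expected array" failure reported above.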
Traceback from command-line launch below: