Open zelima opened 6 years ago
the implementation is still a bit flaky, hopefully it will be improved in dpp v2
there are 3 possible problems with your code (or rather, with dpp..):
dpp:streaming: True must be set on the resource
fixed implementation (haven't tested):
# modified my_flow.py
from dataflows import Flow, add_metadata, dump_to_path, load, printer, update_resource


def flow(parameters, datapackage, resources, stats):
    return Flow(
        # pass through the datapackage and resources received from the pipeline
        load((datapackage, resources)),
        add_metadata(name="finance-vix"),
        load(
            load_source='http://www.cboe.com/publish/ScheduledTask/MktData/datahouse/vixcurrent.csv',
            headers=2
        ),
        # mark the loaded resource as streamed so dpp picks it up
        update_resource('vixcurrent', **{'dpp:streaming': True}),
        dump_to_path()
    )


if __name__ == '__main__':
    Flow(flow({}, {'resources': []}, [], {}), printer()).process()
This is true for version 1.7.2. Version 2.0.0 has introduced some modifications, including:
@zelima does this reproduce in v2.0.0?
@zelima yes, when the issue was opened, this was happening in v2.0.0 as well. Setting dpp:streaming: True helped in both versions, though.
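Since the missing dpp:streaming flag is easy to overlook, a descriptor can be checked for it before the pipeline runs. The snippet below is only a minimal sketch: non_streaming_resources is a hypothetical helper (not part of dpp or dataflows), and the sample descriptor is made up around the vixcurrent resource name used in this thread.

```python
def non_streaming_resources(descriptor):
    """Return the names of resources that do not set 'dpp:streaming' to True."""
    return [
        resource.get('name')
        for resource in descriptor.get('resources', [])
        if resource.get('dpp:streaming') is not True
    ]


# Made-up descriptor, for illustration only.
example = {
    'name': 'finance-vix',
    'resources': [
        {'name': 'vixcurrent', 'path': 'data/vixcurrent.csv'},
    ],
}

# Before the flag is set, the resource is reported.
print(non_streaming_resources(example))

# After setting the flag, nothing is reported.
example['resources'][0]['dpp:streaming'] = True
print(non_streaming_resources(example))
```

Any resource name the helper returns would need the flag added, for instance with update_resource as in the modified my_flow.py above.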
In order to submit an issue, please ensure you can check the following. Thanks!
I have a piece of code with dataflows that works fine if I execute it directly with python my_flow.py. However, if I wrap it inside a flow() function and try to run it as a pipeline via dpp run ./my-pipleine, or run it inside a Docker container via dpp server, it fails silently, showing no error other than "Output pipe disappeared!".
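One generic way to make a silent failure more visible is to log the traceback before the exception leaves flow(). The sketch below is not a confirmed fix for this issue: log_exceptions is a hypothetical decorator, the flow() body is a placeholder that fails on purpose, and it only covers errors raised while flow() itself runs, not errors raised later while rows are being streamed.

```python
import functools
import logging
import sys

# Send log records to stderr so they stay separate from stdout.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger('my_flow')


def log_exceptions(func):
    """Log the full traceback of any exception raised by func, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception('%s failed', func.__name__)
            raise
    return wrapper


@log_exceptions
def flow(parameters, datapackage, resources, stats):
    # Placeholder body (assumption): the real function would build and return a Flow.
    raise RuntimeError('example failure')
```

Calling flow({}, {'resources': []}, [], {}) with this placeholder writes the traceback to stderr and then raises the RuntimeError as before, so the caller's behaviour is unchanged.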
My pipeline-spec.yaml:
My Dockerfile:
Error log on server:
Error log via dpp run ./finance-vix-flow: