Closed: @roll closed this issue 4 years ago
@akariv WDYT? Does it make sense?
I have to admit I don't necessarily see the use case here, which sits somewhere in between passing parameters to a processor and using actual environment variables.
Passing common parameters to all processors might be achieved more elegantly by creating 'global parameters' which are then passed to all processors, updating the per-processor parameters (elegance is debatable, of course 😄).
e.g.:
```yaml
temporal:
  title: temporal
  description: "temporal format"
  parameters:
    debug: True
  pipeline:
    - run: load
      parameters:
        from: 'temporal.csv'
        override_fields:
          date:
            outputFormat: '%m/%d/%Y'
    - run: dump_to_path
      parameters:
        out-path: 'output'
        pretty-descriptor: true
        temporal_format_property: outputFormat
```
Is there any use case here other than controlling the FD libraries' behavior (I'm honestly asking)?
I don't know of other use cases, but I think this feature can still be general if we think of something like providing env vars for an underlying AWS library, `requests`, etc.
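To illustrate the pattern being discussed: libraries often toggle behavior on an env var at call time, which is why setting it per-pipeline (rather than per-process) matters. This is a minimal, hypothetical sketch in the spirit of `TABLESCHEMA_PRESERVE_MISSING_VALUES`; the `cast_value` function is illustrative, not tableschema-py's actual API:

```python
import os

def cast_value(value, missing_values=('',)):
    """Hypothetical cast: normally maps missing values to None,
    but preserves them raw when the env var is set (mirroring the
    behavior toggle described in the thread)."""
    if os.environ.get('TABLESCHEMA_PRESERVE_MISSING_VALUES'):
        return value  # preserve the raw missing value
    return None if value in missing_values else value

# Default behavior: '' is a missing value, cast to None
os.environ.pop('TABLESCHEMA_PRESERVE_MISSING_VALUES', None)
print(cast_value(''))  # → None

# With the env var set, the missing value is preserved as-is
os.environ['TABLESCHEMA_PRESERVE_MISSING_VALUES'] = '1'
print(cast_value(''))  # → ''
```

Because the check happens at call time, an env var declared in the pipeline spec would affect every processor run under it, without threading a parameter through each one.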
If there are other ways to make it work, it should be fine for BCO-DMO. If we could have a custom processor setting env vars, that would be enough, but I guess it's not possible from a processor.
The main goal of this proposal is to make the output of the DPP UI (which BCO-DMO are working on) reproducible on the CLI. Inside their service they can set env vars themselves, but the outputted DPP specs are going to be run in uncontrolled environments.
Hi @akariv,
Sorry, I didn't understand it completely. Are you against this change? Could you please elaborate?
In general, I see this as logical because env variable management is available in many similar specs like Travis CI, Docker Compose, etc.
Hey @roll - given a 2nd thought, I'm okay with this proposal.
Cool @akariv. Are you happy with this PR - https://github.com/frictionlessdata/datapackage-pipelines/pull/182?
DONE (ready to merge in #182)
Can a release be created for this update? Thanks.
Hi @akariv could you please release?
Thanks @akariv!
Overview
Recently, I've added support for the `TABLESCHEMA_PRESERVE_MISSING_VALUES` env var to tableschema-py - https://github.com/frictionlessdata/tableschema-py#experimental - which can be useful for some use cases. I propose we have a standard way to declare environment variables for a pipeline, as is implemented in many other declarative formats like `docker-compose.yml`, `travis.yml`, etc. So we can have something like this:
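The original example is missing from this copy of the issue. A hypothetical sketch of what such a declaration could look like, assuming a top-level `environment` key (the key name and variables here are illustrative, not the final syntax merged in #182):

```yaml
temporal:
  title: temporal
  environment:
    TABLESCHEMA_PRESERVE_MISSING_VALUES: '1'
    DEBUG: '1'
  pipeline:
    - run: load
      parameters:
        from: 'temporal.csv'
    - run: dump_to_path
      parameters:
        out-path: 'output'
```

The idea mirrors the `environment` sections of `docker-compose.yml` and `.travis.yml`: the runner exports these variables before executing the pipeline's processors, so a spec produced by the DPP UI behaves the same when re-run on the CLI.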