RADAR-base / radar-output-restructure

Reads avro files in HDFS and outputs json or csv per topic per user in local file system
Apache License 2.0

Organization bucket #549

Open blootsvoets opened 8 months ago

blootsvoets commented 8 months ago

Proper support for multi-bucket setups.

The default configuration sets no target format, only a default target:

paths:
  target:
    format: null
    default: radar-output-storage

but this can be changed to, for example,

paths:
  target:
    format: radar-output-${mp:organization}
    default: radar-output-storage
    plugins: mp
    properties:
      managementPortalUrl: http://localhost:8080/managementportal

Then for each organization that has a separate bucket, a new target needs to be created:

targets:
  radar-output-storage:
    type: s3
    path: output
    s3: ...
  radar-output-staging:
    type: s3
    path: /
    s3: ...

With the configuration above, all data is written to radar-output-storage, except data generated as part of the staging organization, which is written to radar-output-staging.

The paths.inputs and paths.output settings have been deprecated in favour of specifying the input and output paths as part of the source and target storage.
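
The target selection described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code in this PR: the function name `resolve_target`, the `TARGETS` dictionary, and the rule "fall back to the default when the formatted name is not a configured target" are assumptions inferred from the description, and the elided `s3` settings are left out.

```python
import re

# Matches placeholders such as ${mp:organization}
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")

# Hypothetical stand-in for the `targets` section (s3 settings omitted)
TARGETS = {
    "radar-output-storage": {"type": "s3", "path": "output"},
    "radar-output-staging": {"type": "s3", "path": "/"},
}


def resolve_target(fmt, default, targets, variables):
    """Pick a target name for a record (assumed behaviour, see lead-in).

    fmt: value of paths.target.format, or None
    default: value of paths.target.default
    targets: mapping of configured target names
    variables: placeholder values, e.g. {"mp:organization": "staging"}
    """
    if fmt is None:
        return default

    missing = False

    def substitute(match):
        nonlocal missing
        value = variables.get(match.group(1))
        if value is None:
            # Unknown placeholder: remember it and fall back later
            missing = True
            return ""
        return value

    name = PLACEHOLDER.sub(substitute, fmt)
    # Assumption: only configured targets are used, anything else
    # goes to the default target
    if missing or name not in targets:
        return default
    return name


fmt = "radar-output-${mp:organization}"
print(resolve_target(fmt, "radar-output-storage", TARGETS,
                     {"mp:organization": "staging"}))
print(resolve_target(fmt, "radar-output-storage", TARGETS,
                     {"mp:organization": "other-org"}))
```

Under these assumptions, the first call selects radar-output-staging and the second falls back to radar-output-storage, because no radar-output-other-org target is configured.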

TODO:

Fixes #65.

blootsvoets commented 8 months ago

I am assuming the WIP is the TODO list? You want one of us to pick this up?

Indeed. That would be good, yes.