GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

[Feature Request]: Changing dataset name #426

Closed llermaly closed 5 months ago

llermaly commented 2 years ago

Related Template(s)

PubSub to Elasticsearch

What feature(s) are you requesting?

Ability to change the dataset name to something other than pubsub, audit, or firewall.

We are ingesting arbitrary logs through Pub/Sub, but the only field we can manipulate is "namespace", and we would like to use that field for other purposes.

Setting the dataset to anything other than pubsub, audit, vpn, etc. throws an error.

We would like to have names like:

logs-gcp.app1-dev
logs-gcp.app1-prod
logs-gcp.app2-dev
...
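
To make the requested naming concrete, here is a minimal sketch of how such names could be composed from a free-form dataset and a namespace. The function `data_stream_name` and its parameters are hypothetical and are not part of the template. The hyphen check is an assumption based on Elastic's `<type>-<dataset>-<namespace>` data stream naming scheme, where a hyphen inside the dataset or namespace would make the name ambiguous.

```python
def data_stream_name(dataset: str, namespace: str,
                     type_: str = "logs", prefix: str = "gcp") -> str:
    """Compose a name shaped like logs-gcp.<dataset>-<namespace>.

    Hypothetical helper for illustration only; the template itself
    does not expose this function.
    """
    for label, value in (("dataset", dataset), ("namespace", namespace)):
        # Assumption: hyphens separate the three parts of the name,
        # so they are rejected inside dataset and namespace.
        if not value or "-" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"{type_}-{prefix}.{dataset}-{namespace}"


if __name__ == "__main__":
    for app, env in [("app1", "dev"), ("app1", "prod"), ("app2", "dev")]:
        print(data_stream_name(app, env))
```

With this shape, the application name would live in the dataset and the environment in the namespace, which is the split the request is asking for.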

Please let me know if this is possible and I'm missing something.

Thanks

llermaly commented 2 years ago

My current workaround is to change the index name in an ingest pipeline, but I would like to handle that logic in Dataflow:

PUT _ingest/pipeline/gcp-pipeline
{
  "processors": [
    {
      "set": {
        "field": "_index",
        "value": "{{_metadata.index}}",
        "ignore_empty_value": true,
        "ignore_failure": true
      }
    }
  ]
}
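
For readers unfamiliar with the `set` processor, the following Python sketch mimics what the pipeline above is meant to do. It is an illustration, not Elasticsearch code: `resolve_index` is a hypothetical name, and it assumes each incoming document carries the target index in a `_metadata.index` field. When that field is missing or empty, `ignore_empty_value` leaves `_index` unchanged, which the sketch models by returning the current index.

```python
from typing import Any, Dict


def resolve_index(current_index: str, source: Dict[str, Any]) -> str:
    """Return the index a document would end up in.

    Models the `set` processor on `_index` with a `{{_metadata.index}}`
    value and `ignore_empty_value: true`. Hypothetical helper.
    """
    metadata = source.get("_metadata")
    target = metadata.get("index") if isinstance(metadata, dict) else None
    # An empty or missing value means the processor exits quietly
    # and the document keeps its original index.
    if target is None or target == "":
        return current_index
    return str(target)


if __name__ == "__main__":
    doc = {"message": "hello", "_metadata": {"index": "logs-gcp.app1-dev"}}
    print(resolve_index("logs-gcp.pubsub-default", doc))
    print(resolve_index("logs-gcp.pubsub-default", {"message": "no metadata"}))
```

The drawback of this approach is the one the comment above points out: the routing decision is made in Elasticsearch rather than in Dataflow.
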
github-actions[bot] commented 5 months ago

This issue has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the issue at any time. Thank you for your contributions.

github-actions[bot] commented 5 months ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.