Cloud output v2 - GitHub issues

Context

https://github.com/grafana/k6/issues/2954 introduces the new experimental Cloud output with a Protobuf-based protocol.

Memory usage

After the first iteration, the memory usage is higher than required. With Trend metrics in particular, it is very easy to saturate the bandwidth, with payloads ranging from many kilobytes up to the remote limit (1 MB).

We also decided to denormalize some fields to reduce the workload and keep the implementation simple on the remote server. However, the load this generates on the client is high, so we should revisit this decision.

Fault tolerance

The current flush process could be more fault tolerant: it doesn't retry on failures.

Validation

__name__ and test_run_id are reserved labels for the remote service. If a test also sets them, the conflict produces unexpected behavior for the user. A more developer-friendly UX should be implemented.
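
One possible shape for such a check is sketched below: reject the reserved names up front and report them in a clear error. The function `validateTags` and the `reservedLabels` set are hypothetical names chosen for this example; only the two label names come from this issue.

```go
package main

import (
	"fmt"
	"sort"
)

// reservedLabels holds the label names this issue lists as reserved
// by the remote service.
var reservedLabels = map[string]struct{}{
	"__name__":    {},
	"test_run_id": {},
}

// validateTags is a hypothetical check that returns an error naming
// every user-defined tag that collides with a reserved label.
func validateTags(tags map[string]string) error {
	var conflicts []string
	for name := range tags {
		if _, reserved := reservedLabels[name]; reserved {
			conflicts = append(conflicts, name)
		}
	}
	if len(conflicts) == 0 {
		return nil
	}
	sort.Strings(conflicts) // deterministic error message
	return fmt.Errorf("tag names %v are reserved by the Cloud output", conflicts)
}

func main() {
	fmt.Println(validateTags(map[string]string{"env": "staging"}))
	fmt.Println(validateTags(map[string]string{"test_run_id": "42", "env": "staging"}))
}
```

Failing early with an explicit message is one option; silently dropping or renaming the conflicting tags would be another, with a different trade-off for the user.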

Proposal

We identified some actions that should drive us to the goal:

- A more compact Protobuf representation for Histogram.
- Split the flush into multiple requests when it receives more time series than the MaxMetricSamplesPerPackage variable allows.
- Normalize the fields that are common across time series into fields of MetricSet.
- Fault-tolerant flush operation.
- Exclude __name__ and test_run_id from the allowed tag names.
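
The request-splitting action can be sketched as follows. This is only an illustration under assumptions: `timeSeries` is a stand-in type, and `splitIntoBatches` with its `maxPerBatch` parameter (playing the role of the MaxMetricSamplesPerPackage limit) is a hypothetical helper, not the k6 implementation.

```go
package main

import "fmt"

// timeSeries is a placeholder for whatever the output flushes.
type timeSeries struct {
	Name string
}

// splitIntoBatches is a hypothetical helper that cuts series into
// consecutive batches of at most maxPerBatch items each.
// A limit below 1 is treated as "no limit".
func splitIntoBatches(series []timeSeries, maxPerBatch int) [][]timeSeries {
	if maxPerBatch < 1 {
		maxPerBatch = len(series)
	}
	var batches [][]timeSeries
	for start := 0; start < len(series); start += maxPerBatch {
		end := start + maxPerBatch
		if end > len(series) {
			end = len(series)
		}
		// Sub-slices share the backing array, so nothing is copied.
		batches = append(batches, series[start:end])
	}
	return batches
}

func main() {
	series := make([]timeSeries, 5)
	for i := range series {
		series[i].Name = fmt.Sprintf("metric_%d", i)
	}
	// Each batch would be sent as its own request.
	for i, batch := range splitIntoBatches(series, 2) {
		fmt.Printf("request %d: %d time series\n", i+1, len(batch))
	}
}
```

Each resulting batch would then be sent as a separate request, which keeps individual payloads below the remote size limit as long as the per-batch count is chosen conservatively.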

Acceptance criteria

Change the Cloud output default version to 2.

Worklog

### Must have
- [ ] https://github.com/grafana/k6/pull/3108
- [ ] https://github.com/grafana/k6/pull/3104
- [ ] https://github.com/grafana/k6/pull/3144
- [ ] https://github.com/grafana/k6/pull/3169
- [ ] https://github.com/grafana/k6/issues/3155
- [ ] https://github.com/grafana/k6/issues/3156
- [ ] https://github.com/grafana/k6/pull/3172
- [ ] https://github.com/grafana/k6/issues/3192
- [ ] https://github.com/grafana/k6/issues/3258 (postponed)
- [ ] https://github.com/grafana/k6/pull/3161

### Nice to have (in case we need to reduce the scope)
- [ ] https://github.com/grafana/k6/pull/3120
- [ ] https://github.com/grafana/k6/pull/3125
- [ ] https://github.com/grafana/k6/pull/3146
- [ ] https://github.com/grafana/k6/pull/3137
- [ ] https://github.com/grafana/k6/issues/3122
- [ ] Re-evaluate the current periodic and abort signal architecture/interaction (https://github.com/grafana/k6/pull/3082#discussion_r1207875810, https://github.com/grafana/k6/pull/3104#discussion_r1224212064)
- [ ] Unexport all the structs/methods/fields that do not need to be exported

grafana / k6

Cloud output v2 #3117