Here are the results after running two jobs that use the s3_exporter and the s3_reader. I ran each with two workers. This is the job file that uses the s3_exporter:
{
  "name": "data-to-s3",
  "lifecycle": "persistent",
  "workers": 2,
  "assets": [
    "standard",
    "file"
  ],
  "operations": [
    {
      "_op": "data_generator",
      "size": 10000
    },
    {
      "_op": "s3_exporter",
      "path": "data-folder-1",
      "format": "ldjson"
    }
  ]
}
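For anyone reproducing this, the job can be submitted to the Teraslice master over its REST API. A minimal sketch in TypeScript, assuming the master is reachable at localhost:5678 (the default port) and the job file above is saved as data-to-s3.json; both are assumptions, adjust for your cluster:

// Submit a job file to a Teraslice master via POST /v1/jobs.
// localhost:5678 and the file name are assumptions for this sketch.
import { readFile } from 'node:fs/promises';

async function submitJob(path: string): Promise<void> {
  const job = JSON.parse(await readFile(path, 'utf8'));
  const res = await fetch('http://localhost:5678/v1/jobs', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(job),
  });
  if (!res.ok) throw new Error(`job submission failed: ${res.status}`);
  console.log(await res.json()); // the master responds with the new job id
}

submitJob('data-to-s3.json').catch(console.error);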
Here are the metrics this job produced:
Worker 1:
# HELP teraslice_worker_info Information about Teraslice worker
# TYPE teraslice_worker_info gauge
teraslice_worker_info{arch="arm64",clustering_type="kubernetes",name="teraslice",node_version="v18.19.1",platform="linux",teraslice_version="1.6.1",assignment="worker",ex_id="1fb6b686-b1cd-4f01-b7eb-21343099d0aa",job_id="4391d12a-37db-4178-86f0-01a0e3cbb09c",job_name="data-to-s3",pod_name="ts-wkr-data-to-s3-4391d12a-37db-78fb9f4ff6-hzxrz"} 1
# HELP teraslice_worker_slices_processed Number of slices the worker has processed
# TYPE teraslice_worker_slices_processed gauge
teraslice_worker_slices_processed{name="teraslice",assignment="worker",ex_id="1fb6b686-b1cd-4f01-b7eb-21343099d0aa",job_id="4391d12a-37db-4178-86f0-01a0e3cbb09c",job_name="data-to-s3",pod_name="ts-wkr-data-to-s3-4391d12a-37db-78fb9f4ff6-hzxrz"} 43
# HELP teraslice_worker_records_processed_from_s3 Number of records written into s3
# TYPE teraslice_worker_records_processed_from_s3 gauge
teraslice_worker_records_processed_from_s3{class="S3Batcher",name="teraslice",assignment="worker",ex_id="1fb6b686-b1cd-4f01-b7eb-21343099d0aa",job_id="4391d12a-37db-4178-86f0-01a0e3cbb09c",job_name="data-to-s3",pod_name="ts-wkr-data-to-s3-4391d12a-37db-78fb9f4ff6-hzxrz"} 215000
Worker 2:
# HELP teraslice_worker_info Information about Teraslice worker
# TYPE teraslice_worker_info gauge
teraslice_worker_info{arch="arm64",clustering_type="kubernetes",name="teraslice",node_version="v18.19.1",platform="linux",teraslice_version="1.6.1",assignment="worker",ex_id="1fb6b686-b1cd-4f01-b7eb-21343099d0aa",job_id="4391d12a-37db-4178-86f0-01a0e3cbb09c",job_name="data-to-s3",pod_name="ts-wkr-data-to-s3-4391d12a-37db-78fb9f4ff6-knw5l"} 1
# HELP teraslice_worker_slices_processed Number of slices the worker has processed
# TYPE teraslice_worker_slices_processed gauge
teraslice_worker_slices_processed{name="teraslice",assignment="worker",ex_id="1fb6b686-b1cd-4f01-b7eb-21343099d0aa",job_id="4391d12a-37db-4178-86f0-01a0e3cbb09c",job_name="data-to-s3",pod_name="ts-wkr-data-to-s3-4391d12a-37db-78fb9f4ff6-knw5l"} 43
# HELP teraslice_worker_records_processed_from_s3 Number of records written into s3
# TYPE teraslice_worker_records_processed_from_s3 gauge
teraslice_worker_records_processed_from_s3{class="S3Batcher",name="teraslice",assignment="worker",ex_id="1fb6b686-b1cd-4f01-b7eb-21343099d0aa",job_id="4391d12a-37db-4178-86f0-01a0e3cbb09c",job_name="data-to-s3",pod_name="ts-wkr-data-to-s3-4391d12a-37db-78fb9f4ff6-knw5l"} 215000
Each worker reported 215,000 records, so this job wrote 430,000 records into s3 in total.
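That total can be cross-checked without eyeballing each dump by summing the gauge across the workers' metrics endpoints. A sketch follows; the worker addresses and metrics port are placeholders, not the real pod addresses, and in the cluster you would normally let Prometheus scrape the pods and sum over job_name instead:

// Sum one gauge across worker /metrics endpoints.
// The hostnames and port 3333 below are placeholders for this sketch.
const WORKERS = [
  'http://ts-wkr-1:3333/metrics',
  'http://ts-wkr-2:3333/metrics',
];

async function sumGauge(metric: string): Promise<number> {
  let total = 0;
  for (const url of WORKERS) {
    const text = await (await fetch(url)).text();
    for (const line of text.split('\n')) {
      // Exposition lines look like: name{labels} value
      // HELP/TYPE lines start with '#', so this prefix test skips them.
      if (line.startsWith(metric)) {
        total += Number(line.trim().split(/\s+/).pop());
      }
    }
  }
  return total;
}

sumGauge('teraslice_worker_records_processed_from_s3')
  .then((total) => console.log(total)); // expect 215000 + 215000 = 430000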
Job file that uses the s3_reader:
{
  "name": "s3-to-es",
  "lifecycle": "persistent",
  "workers": 2,
  "assets": [
    "elasticsearch",
    "file"
  ],
  "operations": [
    {
      "_op": "s3_reader",
      "path": "data-folder-1",
      "size": 10000,
      "format": "ldjson"
    },
    {
      "_op": "elasticsearch_bulk",
      "size": 10000,
      "index": "data-folder-1"
    }
  ]
}
Worker Metrics 1:
# HELP teraslice_worker_info Information about Teraslice worker
# TYPE teraslice_worker_info gauge
teraslice_worker_info{arch="arm64",clustering_type="kubernetes",name="teraslice",node_version="v18.19.1",platform="linux",teraslice_version="1.6.1",assignment="worker",ex_id="19d6660a-79f2-49c9-84a0-fa3b75b5eada",job_id="b1656941-9bd8-4e9d-b551-05c3473e4346",job_name="s3-to-es",pod_name="ts-wkr-s3-to-es-b1656941-9bd8-6b4967c9b8-45tqq"} 1
# HELP teraslice_worker_slices_processed Number of slices the worker has processed
# TYPE teraslice_worker_slices_processed gauge
teraslice_worker_slices_processed{name="teraslice",assignment="worker",ex_id="19d6660a-79f2-49c9-84a0-fa3b75b5eada",job_id="b1656941-9bd8-4e9d-b551-05c3473e4346",job_name="s3-to-es",pod_name="ts-wkr-s3-to-es-b1656941-9bd8-6b4967c9b8-45tqq"} 7918
# HELP teraslice_worker_records_read_from_s3 Number of records read from s3
# TYPE teraslice_worker_records_read_from_s3 gauge
teraslice_worker_records_read_from_s3{class="S3Fetcher",name="teraslice",assignment="worker",ex_id="19d6660a-79f2-49c9-84a0-fa3b75b5eada",job_id="b1656941-9bd8-4e9d-b551-05c3473e4346",job_name="s3-to-es",pod_name="ts-wkr-s3-to-es-b1656941-9bd8-6b4967c9b8-45tqq"} 217367
Worker Metrics 2:
# HELP teraslice_worker_info Information about Teraslice worker
# TYPE teraslice_worker_info gauge
teraslice_worker_info{arch="arm64",clustering_type="kubernetes",name="teraslice",node_version="v18.19.1",platform="linux",teraslice_version="1.6.1",assignment="worker",ex_id="19d6660a-79f2-49c9-84a0-fa3b75b5eada",job_id="b1656941-9bd8-4e9d-b551-05c3473e4346",job_name="s3-to-es",pod_name="ts-wkr-s3-to-es-b1656941-9bd8-6b4967c9b8-5q9lx"} 1
# HELP teraslice_worker_slices_processed Number of slices the worker has processed
# TYPE teraslice_worker_slices_processed gauge
teraslice_worker_slices_processed{name="teraslice",assignment="worker",ex_id="19d6660a-79f2-49c9-84a0-fa3b75b5eada",job_id="b1656941-9bd8-4e9d-b551-05c3473e4346",job_name="s3-to-es",pod_name="ts-wkr-s3-to-es-b1656941-9bd8-6b4967c9b8-5q9lx"} 7737
# HELP teraslice_worker_records_read_from_s3 Number of records read from s3
# TYPE teraslice_worker_records_read_from_s3 gauge
teraslice_worker_records_read_from_s3{class="S3Fetcher",name="teraslice",assignment="worker",ex_id="19d6660a-79f2-49c9-84a0-fa3b75b5eada",job_id="b1656941-9bd8-4e9d-b551-05c3473e4346",job_name="s3-to-es",pod_name="ts-wkr-s3-to-es-b1656941-9bd8-6b4967c9b8-5q9lx"} 212633
The two workers read 217,367 + 212,633 = 430,000 records in total, and curling the Elasticsearch indices shows a new index holding all 430,000 documents:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open data-folder-1 Z-j6MiSVSbqwpRSE2-obMw 1 1 430000 0 289.3mb 289.3mb
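The same check can be scripted against Elasticsearch's standard _count API instead of reading _cat/indices by hand. A small sketch, assuming Elasticsearch is reachable at localhost:9200:

// Verify the indexed document count matches the 430,000 records the
// exporter job wrote. localhost:9200 is an assumption for this sketch.
async function verifyCount(): Promise<void> {
  const res = await fetch('http://localhost:9200/data-folder-1/_count');
  const { count } = (await res.json()) as { count: number };
  console.log(count === 430_000 ? 'counts match' : `mismatch: ${count}`);
}

verifyCount().catch(console.error);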
A couple of naming suggestions before merge:
- Rename teraslice_worker_records_processed_from_s3 to teraslice_worker_records_written_to_s3.
- Change the class="S3Batcher" label to op_name, here and on teraslice_worker_records_read_from_s3.
We need to bump the asset version by a minor version. Then I can merge this.
I have bumped the asset version by a minor version.
This PR makes the following changes:
- Adds a records_processed_from_s3 prom metric to the s3_exporter operation
- Adds a records_read_from_s3 prom metric to the s3_reader operation
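For context on where the exposition lines above come from, here is an illustration of a gauge with the same shape, written against the prom-client library. This is only a sketch of the metric's structure; the asset itself registers metrics through Teraslice's own metrics plumbing:

// Illustration only: a gauge with the same name, help text, and labels as
// the exposition output above. Not how the asset wires it up internally.
import { Gauge, register } from 'prom-client';

const recordsWritten = new Gauge({
  name: 'teraslice_worker_records_processed_from_s3',
  help: 'Number of records written into s3',
  labelNames: ['class', 'job_name'],
});

// The batcher would bump the gauge by the size of each flushed batch.
recordsWritten.inc({ class: 'S3Batcher', job_name: 'data-to-s3' }, 10_000);

// register.metrics() renders the exposition text served at /metrics.
register.metrics().then(console.log);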