Closed · dchaley closed this issue 3 weeks ago
Dividing the list of benchmarking columns by phase:

Shared columns (recorded once per run):
"input_file_id"
"numpy_size_mb"
"pixels_m"
"compartment"
"benchmark_datetime_utc"
"success"

Per-phase columns (repeated for each of preprocessing, prediction, and postprocessing):
"instance_type"
"gpu_type"
"num_gpus"
"success"
"peak_memory_gb"
"provisioning_model"
"input_load_time_s"
"time_s"
"output_write_time_s"

Prediction-only columns:
"model_load_time_s"
"batch_size"
The resulting BigQuery schema:
input_file_id:STRING,
numpy_size_mb:FLOAT,
pixels_m:INTEGER,
compartment:STRING,
benchmark_datetime_utc:DATETIME,
success:BOOLEAN,
cloud_region:STRING,
preprocessing_instance_type:STRING,
preprocessing_gpu_type:STRING,
preprocessing_num_gpus:INTEGER,
preprocessing_success:BOOLEAN,
preprocessing_peak_memory_gb:FLOAT,
preprocessing_is_preemptible:BOOLEAN,
preprocessing_input_load_time_s:FLOAT,
preprocessing_time_s:FLOAT,
preprocessing_output_write_time_s:FLOAT,
prediction_instance_type:STRING,
prediction_gpu_type:STRING,
prediction_num_gpus:INTEGER,
prediction_success:BOOLEAN,
prediction_peak_memory_gb:FLOAT,
prediction_is_preemptible:BOOLEAN,
prediction_input_load_time_s:FLOAT,
prediction_time_s:FLOAT,
prediction_output_write_time_s:FLOAT,
prediction_model_load_time_s:FLOAT,
prediction_batch_size:INTEGER,
postprocessing_instance_type:STRING,
postprocessing_gpu_type:STRING,
postprocessing_num_gpus:INTEGER,
postprocessing_success:BOOLEAN,
postprocessing_peak_memory_gb:FLOAT,
postprocessing_is_preemptible:BOOLEAN,
postprocessing_input_load_time_s:FLOAT,
postprocessing_time_s:FLOAT,
postprocessing_output_write_time_s:FLOAT
This is done. We ran the steps independently and gathered the results for upload to BigQuery.
The final schema that we used:
[{"name":"input_file_id","type":"STRING","mode":"NULLABLE"},{"name":"numpy_size_mb","type":"FLOAT","mode":"NULLABLE"},{"name":"pixels_m","type":"INTEGER","mode":"NULLABLE"},{"name":"compartment","type":"STRING","mode":"NULLABLE"},{"name":"benchmark_datetime_utc","type":"DATETIME","mode":"NULLABLE"},{"name":"success","type":"BOOLEAN","mode":"NULLABLE"},{"name":"cloud_region","type":"STRING","mode":"NULLABLE"},{"name":"preprocessing_instance_type","type":"STRING","mode":"NULLABLE"},{"name":"preprocessing_gpu_type","type":"STRING","mode":"NULLABLE"},{"name":"preprocessing_num_gpus","type":"INTEGER","mode":"NULLABLE"},{"name":"preprocessing_success","type":"BOOLEAN","mode":"NULLABLE"},{"name":"preprocessing_peak_memory_gb","type":"FLOAT","mode":"NULLABLE"},{"name":"preprocessing_is_preemptible","type":"BOOLEAN","mode":"NULLABLE"},{"name":"preprocessing_input_load_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"preprocessing_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"preprocessing_output_write_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"prediction_instance_type","type":"STRING","mode":"NULLABLE"},{"name":"prediction_gpu_type","type":"STRING","mode":"NULLABLE"},{"name":"prediction_num_gpus","type":"INTEGER","mode":"NULLABLE"},{"name":"prediction_success","type":"BOOLEAN","mode":"NULLABLE"},{"name":"prediction_peak_memory_gb","type":"FLOAT","mode":"NULLABLE"},{"name":"prediction_is_preemptible","type":"BOOLEAN","mode":"NULLABLE"},{"name":"prediction_input_load_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"prediction_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"prediction_output_write_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"prediction_model_load_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"prediction_batch_size","type":"INTEGER","mode":"NULLABLE"},{"name":"postprocessing_instance_type","type":"STRING","mode":"NULLABLE"},{"name":"postprocessing_gpu_type","type":"STRING","mode":"NULLABLE"},{"name":"postprocessing_num_gpus","type":
"INTEGER","mode":"NULLABLE"},{"name":"postprocessing_success","type":"BOOLEAN","mode":"NULLABLE"},{"name":"postprocessing_peak_memory_gb","type":"FLOAT","mode":"NULLABLE"},{"name":"postprocessing_is_preemptible","type":"BOOLEAN","mode":"NULLABLE"},{"name":"postprocessing_input_load_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"postprocessing_time_s","type":"FLOAT","mode":"NULLABLE"},{"name":"postprocessing_output_write_time_s","type":"FLOAT","mode":"NULLABLE"}]
cc @WeihaoGe1009
We used to be able to record benchmarking measurements easily because everything was happening in one process.
With the processes split out after #222, the processes are by design independent… so this doesn't work anymore.
Instead: write benchmarking results to Cloud Storage as we go, and add a final step that pulls them in and records them to the benchmarking table.
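That final collection step could look roughly like this. A minimal sketch, assuming each phase writes its measurements as a small JSON dict; the Cloud Storage read itself is omitted, and `merge_phase_results` plus the sample keys and values are illustrative:

```python
import json

def merge_phase_results(shared: dict, phase_results: dict) -> dict:
    """Flatten {phase: {column: value}} into one row with phase-prefixed
    column names, matching the benchmarking table schema."""
    row = dict(shared)
    for phase, results in phase_results.items():
        for key, value in results.items():
            row[f"{phase}_{key}"] = value
    return row

# Hypothetical per-phase results, as read back from Cloud Storage.
row = merge_phase_results(
    {"input_file_id": "img-001.npz", "success": True},
    {
        "preprocessing": {"time_s": 12.3, "peak_memory_gb": 4.2},
        "prediction": {"time_s": 48.9, "batch_size": 16},
        "postprocessing": {"time_s": 7.5},
    },
)
print(json.dumps(row))  # one flat row, ready to load into BigQuery
```

Keeping the merge in one place means the per-phase processes stay independent and only need to agree on where in Cloud Storage their result files land.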