Saving a 75 MB wcon file took 20+ minutes on my quad core machine, which is unacceptably slow, I agree.
I will start by profiling `_data_segment_as_odict`, which is the method that is taking up all the time. I suspect it's because I'm not pre-allocating memory for `jagged_array`.
I confirmed with `mpstat -P ALL 2` while running `WCON test.py` that only one core gets used, so renting a 32-core instance on AWS is definitely a waste of dollars.

I then used `line_profiler` to discover that the `.loc` DataFrame views in `_data_segment_as_odict` were creating copies of tables rather than views:

File: ../../tracker-commons/src/Python/wcon/wcon_data.py
Function: _data_segment_as_odict at line 613
Line # Hits Time Per Hit % Time Line Contents
==============================================================
613 @profile
614 def _data_segment_as_odict(worm_id, df_segment):
615 """
616 Convert a pandas dataframe into an ordered_dictionary. This is a support
617 method of data_as_array.
618
619 """
620 1 54 54.0 0.0 print("entering _data_segment_as_odict")
621 1 4 4.0 0.0 start_time = time.monotonic()
622 1 4 4.0 0.0 gc.disable()
623 1 2 2.0 0.0 data_segment = [("id", worm_id)]
624
625 1 2 2.0 0.0 print("TIMEPOINT 1: %.2f seconds" %
626 1 26 26.0 0.0 (time.monotonic() - start_time))
627 1 3 3.0 0.0 start_time = time.monotonic()
628
629
630 # We only care about time indices for which we have some "aspect" or
631 # else which have some non-aspect
632 1 3361 3361.0 0.0 df_segment = df_segment[(~df_segment.isnull()).any(axis=1)]
633
634 # We must make the array "jagged" according to the "aspect size",
635 # since aspect size may differ frame-by-frame
636 1 993 993.0 0.0 worm_aspect_size = df_segment.loc[:, ('aspect_size')]
637
638 1 28 28.0 0.0 data_segment.append(('t', list(df_segment.index)))
639
640 1 350 350.0 0.0 keys_used = [k for k in df_segment.columns.get_level_values('key')
641 if k != 'aspect_size']
642
643 1 3 3.0 0.0 print("TIMEPOINT 2: %.2f seconds" %
644 1 36 36.0 0.0 (time.monotonic() - start_time))
645 1 3 3.0 0.0 start_time = time.monotonic()
646
647 # e.g. ox, oy, head, ventral
648 1 152 152.0 0.0 for key in [k for k in keys_used if k in elements_without_aspect]:
649 cur_segment_slice = df_segment.loc[:, idx[key, 0]]
650 # We must replace NaN with None, otherwise the JSON encoder will
651 # save 'NaN' as the string and this will get rejected by our schema
652 # on any subsequent loads
653 # Note we can't use .fillna(None) due to this issue:
654 # https://github.com/pydata/pandas/issues/1972
655 if key in ['head', 'ventral']:
656 cur_segment_slice = \
657 cur_segment_slice.where(pd.notnull(cur_segment_slice), None)
658 cur_list = list(np.array(cur_segment_slice))
659 data_segment.append((key, cur_list))
660
661 1 2 2.0 0.0 print("TIMEPOINT 3: %.2f seconds" %
662 1 28 28.0 0.0 (time.monotonic() - start_time))
663 1 3 3.0 0.0 start_time = time.monotonic()
664
665 1 1 1.0 0.0 accumulated_time0 = 0
666 1 1 1.0 0.0 accumulated_time1 = 0
667 1 1 1.0 0.0 accumulated_time2 = 0
668 1 2 2.0 0.0 accumulated_time3 = 0
669
670 # e.g. x, y
671 503 1104 2.2 0.0 for key in [k for k in keys_used if k in elements_with_aspect]:
672 502 1119 2.2 0.0 st = time.monotonic()
673 502 433404 863.4 0.3 non_jagged_array = df_segment.loc[:, (key)]
674 502 2150 4.3 0.0 accumulated_time0 += time.monotonic() - st
675
676 502 998 2.0 0.0 jagged_array = []
677
678 101404 248991 2.5 0.2 for t in non_jagged_array.index:
679 # If aspect size isn't defined, don't bother adding data here:
680 100902 45335029 449.3 34.0 if np.isnan(worm_aspect_size.loc[t, 0]):
681 jagged_array.append([])
682 else:
683 100902 365646 3.6 0.3 st = time.monotonic()
684
685 100902 44400822 440.0 33.3 cur_aspect_size = int(worm_aspect_size.loc[t, 0])
686 100902 403096 4.0 0.3 accumulated_time1 += time.monotonic() - st
687 # For some reason loc's slice notation is INCLUSIVE!
688 # so we must subtract one from cur_aspect_size, so if
689 # it's 3, for instance, we get only entries
690 # 0, 1, and 2, as required.
691 100902 198099 2.0 0.1 if cur_aspect_size == 0:
692 27108 53826 2.0 0.0 cur_entry = []
693 else:
694 73794 154050 2.1 0.1 st = time.monotonic()
695 73794 36917435 500.3 27.7 cur_entry = non_jagged_array.loc[t, 0:cur_aspect_size - 1]
696 73794 297507 4.0 0.2 accumulated_time2 += time.monotonic() - st
697 73794 163166 2.2 0.1 st = time.monotonic()
698 73794 3883578 52.6 2.9 cur_entry = list(np.array(cur_entry))
699 73794 280115 3.8 0.2 accumulated_time3 += time.monotonic() - st
700
701
702 100902 248648 2.5 0.2 jagged_array.append(cur_entry)
703
704 502 1248 2.5 0.0 data_segment.append((key, jagged_array))
705
706 1 2 2.0 0.0 print("TIMEPOINT 4: %.2f seconds" %
707 1 75 75.0 0.0 (time.monotonic() - start_time))
708 1 3 3.0 0.0 start_time = time.monotonic()
709
710 1 2 2.0 0.0 print("accumulatedtime0: %.2f seconds" %
711 1 22 22.0 0.0 (accumulated_time0))
712
713 1 2 2.0 0.0 print("accumulatedtime1: %.2f seconds" %
714 1 18 18.0 0.0 (accumulated_time1))
715
716 1 2 2.0 0.0 print("accumulatedtime2: %.2f seconds" %
717 1 15 15.0 0.0 (accumulated_time2))
718
719 1 2 2.0 0.0 print("accumulatedtime3: %.2f seconds" %
720 1 15 15.0 0.0 (accumulated_time3))
721
722 1 7 7.0 0.0 gc.enable()
723
724 1 2604 2604.0 0.0 return OrderedDict(data_segment)
Full 4500+ frame file with garbage collection: 1744.49 seconds = ~29 minutes
Full 4500+ frame file with garbage collection disabled during crucial parts: 1546.59 seconds = ~26 minutes
I profiled with just the first 200 frames so I didn't have to wait so long:
- Moving the `worm_aspect_size.loc[t, 0]` lookup outside the main `t` loop: 27.84 seconds
- Using `jagged_array = non_jagged_array.values.tolist()` and then slicing it down, rather than iterating through each time frame, slicing the DataFrame, converting it to a list, and appending it to the jagged array: 1.60 seconds

These changes are in commit 16b1bd4126e165fe953c47f70c4555746e7d47a8.
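To make the second change concrete, here is a minimal sketch of the optimized inner loop, assuming a per-key DataFrame slice `non_jagged_array` and the `worm_aspect_size` slice from the surrounding function; the helper name `build_jagged_array` is hypothetical, and the real logic lives inline in `_data_segment_as_odict`:

```python
def build_jagged_array(non_jagged_array, worm_aspect_size):
    """Hypothetical helper mirroring the optimized loop (commit 16b1bd4).

    non_jagged_array: DataFrame slice for one key (e.g. 'x'), columns 0..max_aspect-1
    worm_aspect_size: DataFrame slice holding the per-frame aspect size
    """
    # Hoisted out of the per-frame loop: one vectorized conversion of the
    # aspect sizes, instead of a .loc lookup on every time index.
    num_spine_points = worm_aspect_size.loc[:, 0].astype(int).tolist()

    # One bulk conversion to nested Python lists instead of ~100k .loc calls.
    jagged_array = non_jagged_array.values.tolist()

    for i, cur_aspect_size in enumerate(num_spine_points):
        # Plain list slicing drops the NaN padding beyond this frame's size.
        jagged_array[i] = jagged_array[i][:cur_aspect_size]
    return jagged_array
```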
Now profiling the full file is more feasible. Full file with the above performance improvements to `_data_segment_as_odict`: 38.53 seconds (a 40x speedup).
New profile:
Total time: 35.8752 s
File: ../../tracker-commons/src/Python/wcon/wcon_data.py
Function: _data_segment_as_odict at line 612
Line # Hits Time Per Hit % Time Line Contents
==============================================================
612 @profile
613 def _data_segment_as_odict(worm_id, df_segment):
614 """
615 Convert a pandas dataframe into an ordered_dictionary. This is a support
616 method of data_as_array.
617
618 """
619 # Disable garbage collection to speed up list append operations
620 1 11 11.0 0.0 gc.disable()
621 1 2 2.0 0.0 data_segment = [("id", worm_id)]
622
623 # We only care about time indices for which we have some "aspect" or
624 # else which have some non-aspect
625 1 42170 42170.0 0.1 df_segment = df_segment[(~df_segment.isnull()).any(axis=1)]
626
627 # We must make the array "jagged" according to the "aspect size",
628 # since aspect size may differ frame-by-frame
629 1 1164 1164.0 0.0 worm_aspect_size = df_segment.loc[:, ('aspect_size')]
630
631 1 255 255.0 0.0 data_segment.append(('t', list(df_segment.index)))
632
633 1 381 381.0 0.0 keys_used = [k for k in df_segment.columns.get_level_values('key')
634 if k != 'aspect_size']
635
636 # e.g. ox, oy, head, ventral
637 1 139 139.0 0.0 for key in [k for k in keys_used if k in elements_without_aspect]:
638 cur_segment_slice = df_segment.loc[:, idx[key, 0]]
639 # We must replace NaN with None, otherwise the JSON encoder will
640 # save 'NaN' as the string and this will get rejected by our schema
641 # on any subsequent loads
642 # Note we can't use .fillna(None) due to this issue:
643 # https://github.com/pydata/pandas/issues/1972
644 if key in ['head', 'ventral']:
645 cur_segment_slice = \
646 cur_segment_slice.where(pd.notnull(cur_segment_slice), None)
647 cur_list = list(np.array(cur_segment_slice))
648 data_segment.append((key, cur_list))
649
650 1 993 993.0 0.0 num_spine_points = worm_aspect_size.loc[:, 0].astype(int).tolist()
651
652 # e.g. x, y
653 503 1376 2.7 0.0 for key in [k for k in keys_used if k in elements_with_aspect]:
654 502 602137 1199.5 1.7 non_jagged_array = df_segment.loc[:, (key)]
655
656 502 19032958 37914.3 53.1 jagged_array = non_jagged_array.values.tolist()
657
658 502 5012 10.0 0.0 num_time_points = len(jagged_array)
659
660 2330786 3514033 1.5 9.8 for i in range(num_time_points):
661 2330284 3610922 1.5 10.1 cur_aspect_size = num_spine_points[i]
662
663 2330284 9060164 3.9 25.3 jagged_array[i] = jagged_array[i][:cur_aspect_size]
664
665 502 3190 6.4 0.0 data_segment.append((key, jagged_array))
666
667 # Re-enable garbage collection now that all our list append
668 # operations are done.
669 1 15 15.0 0.0 gc.enable()
670
671 1 234 234.0 0.0 return OrderedDict(data_segment)
Next steps:

So, reading the whole 4500+ frame file takes 73.81 seconds:
- 12% of the time is spent in `_validate_time_series_data`
- 88% in `_obtain_time_series_data_frame`

Zooming in on `_obtain_time_series_data_frame`, sadly nothing is obviously capable of optimization (see below). Instead, I think adding multicore support is probably the best approach.
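A rough sketch of what multicore parsing could look like, assuming each raw data segment can be converted independently; `_segment_to_dataframe` is a hypothetical refactoring of the per-segment body of `_obtain_time_series_data_frame`, and the merge step assumes segments do not share worm ids:

```python
from multiprocessing import Pool

import pandas as pd

def _obtain_time_series_data_frame_parallel(time_series_data, num_workers=4):
    """Sketch only: build each segment's DataFrame in its own process.

    _segment_to_dataframe is a hypothetical helper containing the
    per-segment body of _obtain_time_series_data_frame (everything up to
    and including the cur_df column sort).  Raw segments are plain dicts,
    so they pickle cheaply to worker processes.
    """
    with Pool(num_workers) as pool:
        segment_dfs = pool.map(_segment_to_dataframe, list(time_series_data))

    # Columns already carry the worm id at their first level, so a
    # column-wise concat stands in for the df_upsert loop, assuming the
    # segments do not overlap in worm id.
    time_df = pd.concat(segment_dfs, axis=1)
    return time_df.sort_index(axis=0).sort_index(axis=1)
```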
Line # Hits Time Per Hit % Time Line Contents
==============================================================
256 @profile
257 def parse_data(data):
258 """
259 Parse the an array of entries conforming to the WCON schema definition
260 for the "data" array in the root object. The canonical example is
261 that of worm "skeleton" (midline) information over time.
262
263 This could be the standard "data" array from the root object, or
264 some custom array that needs to be processed
265
266 Note that all elements are required to have "id", "t", and "x" and "y"
267 entries.
268
269 Return an object encapsulating that data.
270
271 """
272 # If data is single-valued, wrap it in a list so it will be just
273 # a special case of the array case.
274 2 7 3.5 0.0 if isinstance(data, dict):
275 1 1 1.0 0.0 data = [data]
276
277 2 41 20.5 0.0 data = np.array(data)
278
279 # We only care about data that have all the mandatory fields.
280 # the Custom Feature Type 2 objects are suppresed by this filter:
281 2 111 55.5 0.0 is_time_series_mask = get_mask(data, 't')
282 2 37 18.5 0.0 has_id_mask = get_mask(data, 'id')
283 2 60 30.0 0.0 data = data[is_time_series_mask & has_id_mask]
284
285 # Clean up and validate all time-series data segments
286 2 86335 43167.5 12.0 _validate_time_series_data(data)
287
288 2 631829 315914.5 87.9 time_df = _obtain_time_series_data_frame(data)
289
290 2 5 2.5 0.0 return time_df
ubuntu@ip-172-31-9-94:~/github/open-worm-analysis-toolbox/examples$ kernprof -l WCON\ demo.py
RUNNING TEST WCON demo.py:
Time elapsed: 38.36 seconds
NOW, let's LOAD this file!
Loading file: testfile2.wcon
Time elapsed: 73.18 seconds
Wrote profile results to WCON demo.py.lprof
ubuntu@ip-172-31-9-94:~/github/open-worm-analysis-toolbox/examples$ python -m line_profiler WCON\ demo.py.lprof
Timer unit: 1e-06 s
Total time: 0.74199 s
File: ../../tracker-commons/src/Python/wcon/wcon_data.py
Function: _obtain_time_series_data_frame at line 291
Line # Hits Time Per Hit % Time Line Contents
==============================================================
291 @profile
292 def _obtain_time_series_data_frame(time_series_data):
293 """
294 Obtain a time-series pandas DataFrame
295
296 Parameters
297 ------------
298 time_series_data: list
299 All entries must be lists of dicts, all of which
300 must have 'id' and 't'
301
302 Returns
303 ----------
304 pandas dataframe
305 Time-series data goes into this Pandas DataFrame
306 The dataframe will have t as index, and multilevel columns
307 with id at the first level and all other keys at second level.
308
309 """
310
311 # Our DataFrame to return
312 2 9 4.5 0.0 time_df = None
313
314 # Consider only time-series data stamped with an id:
315 4 31 7.8 0.0 for data_segment in time_series_data:
316 # Add this data_segment to a Pandas dataframe
317 2 6 3.0 0.0 segment_id = data_segment['id']
318 2 44 22.0 0.0 segment_keys = np.array([k for k in data_segment.keys()
319 if k not in ['t', 'id']])
320
321 2 827 413.5 0.1 cur_timeframes = np.array(data_segment['t']).astype(float)
322
323 # Create our column names as the cartesian product of
324 # the segment's keys and the id of the segment
325 # Only elements articulated across the skeleton, etc, of the worm
326 # have the "aspect" key though
327 cur_elements_with_aspect = \
328 2 46 23.0 0.0 [k for k in elements_with_aspect if k in segment_keys]
329 2 5 2.5 0.0 cur_elements_without_aspect = ['aspect_size'] + \
330 2 47 23.5 0.0 [k for k in elements_without_aspect if k in segment_keys]
331
332 # We want to be able to fit the largest aspect size in our
333 # DataFrame
334 2 4575 2287.5 0.6 max_aspect_size = int(max([k[0] for k in data_segment['aspect_size']]))
335
336 2 34 17.0 0.0 key_combos = list(itertools.product([segment_id],
337 2 5 2.5 0.0 cur_elements_with_aspect,
338 2 101 50.5 0.0 range(max_aspect_size)))
339 2 8 4.0 0.0 key_combos.extend(list(itertools.product([segment_id],
340 2 4 2.0 0.0 cur_elements_without_aspect,
341 2 10 5.0 0.0 [0])))
342
343 2 5 2.5 0.0 column_type_names = ['id', 'key', 'aspect']
344 2 47 23.5 0.0 cur_columns = pd.MultiIndex.from_tuples(key_combos,
345 2 5968 2984.0 0.8 names=column_type_names)
346
347 # e.g. if this segment has only 'x', 'y', that's what we'll be
348 # looking to add to our dataframe data staging in the next step
349 2 6 3.0 0.0 cur_data_keys = cur_elements_with_aspect + \
350 2 6 3.0 0.0 cur_elements_without_aspect
351
352 # We must pad the timeframes where the data doesn't have maximal
353 # aspect or else the concatenation step below will fail.
354 6 18 3.0 0.0 for k in cur_elements_with_aspect:
355 18572 47083 2.5 6.3 for i in range(len(cur_timeframes)):
356 data_segment[k][i] = (
357 18568 46742 2.5 6.3 data_segment[k][i] +
358 18568 128965 6.9 17.4 [np.NaN] * (max_aspect_size - len(data_segment[k][i])))
359
360 # Stage the data for addition to our DataFrame.
361 # Shape KxI where K is the number of keys and
362 # I is the number of "aspects"
363 2 8 4.0 0.0 cur_data = np.array(
364 2 11 5.5 0.0 [np.concatenate(
365 [data_segment[k][i] for k in cur_data_keys]
366 2 347316 173658.0 46.8 ) for i in range(len(cur_timeframes))]
367 )
368
369 2 2274 1137.0 0.3 cur_df = pd.DataFrame(cur_data, columns=cur_columns)
370
371 2 358 179.0 0.0 cur_df.index = cur_timeframes
372 2 20 10.0 0.0 cur_df.index.names = 't'
373
374 # We want the index (time) to be in order.
375 2 358 179.0 0.0 cur_df.sort_index(axis=0, inplace=True)
376
377 # Apparently multiindex must be sorted to work properly:
378 2 67314 33657.0 9.1 cur_df.sort_index(axis=1, inplace=True)
379
380 2 22 11.0 0.0 if time_df is None:
381 2 5 2.5 0.0 time_df = cur_df
382 else:
383 time_df = df_upsert(src=cur_df, dest=time_df)
384
385 # We want the index (time) to be in order.
386 2 80 40.0 0.0 time_df.sort_index(axis=0, inplace=True)
387
388 # We have to do this because somehow with entire worms who have 'head' or
389 # 'ventral' columns, all their columns (even the numeric ones)
390 # have dtype = object ! We want dtype=float64 (or some float anyway)
391 # (Ignore the FutureWarning here about convert_objects being deprecated,
392 # because to_numeric only acts on Series and I don't know which ones
393 # we need to convert.)
394 2 201 100.5 0.0 with warnings.catch_warnings():
395 2 232 116.0 0.0 warnings.filterwarnings(action="ignore", category=FutureWarning)
396 2 53569 26784.5 7.2 time_df = time_df.convert_objects(convert_numeric=True)
397
398 # If 'head' or 'ventral' is NaN, we must specify '?' since
399 # otherwise, when saving this object, to specify "no value" we would
400 # have to create a new data segment with no 'head' column at all.
401 # But since both 'head' and 'ventral' have this issue, we might need
402 # segments with just 'head', just 'ventral', and both 'head' and
403 # 'ventral', and none. That's too unwieldy for a rare use case.
404 # So instead we treat nan as '?'.
405
406 # We must replace NaN with None, otherwise the JSON encoder will
407 # save 'NaN' as the string and this will get rejected by our schema
408 # on any subsequent loads
409 # Note we can't use .fillna(None) due to this issue:
410 # https://github.com/pydata/pandas/issues/1972
411 2 521 260.5 0.1 df_keys = set(time_df.columns.get_level_values('key'))
412 6 17 2.8 0.0 for k in ['head', 'ventral']:
413 4 10 2.5 0.0 if k in df_keys:
414 cur_slice = time_df.loc[:, idx[:, k, :]]
415 time_df.loc[:, idx[:, k, :]] = cur_slice.fillna(value=np.nan)
416
417 # Make sure aspect_size is a float, since only floats are nullable:
418 2 5 2.5 0.0 if 'aspect_size' in df_keys:
419 time_df.loc[:, idx[:, 'aspect_size', :]] = \
420 2 35071 17535.5 4.7 time_df.loc[:, idx[:, 'aspect_size', :]].astype(float)
421
422 2 6 3.0 0.0 return time_df
Is that the best way to use np.concatenate? What is its performance on sequential additions? I'd think you'd instead be after some sort of bulk operation. Also, the NaN-padding is surprisingly expensive. Any idea what fraction of values are NaN in the test file?
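For reference, one bulk approach along those lines might look like the following (a sketch only; `pad_segment_key` is a hypothetical helper and assumes each frame's values are numeric):

```python
import numpy as np

def pad_segment_key(frames, max_aspect_size):
    """Pad variable-length per-frame lists into one dense (T, A) float array.

    A possible bulk alternative to padding ~18k Python lists one at a
    time: allocate a single NaN-filled array and copy each frame in.
    """
    padded = np.full((len(frames), max_aspect_size), np.nan)
    for i, frame in enumerate(frames):
        padded[i, :len(frame)] = frame
    return padded

# The staging step could then be a single horizontal stack over keys:
# cur_data = np.hstack([pad_segment_key(data_segment[k], max_aspect_size)
#                       for k in cur_data_keys])
```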
@Ichoran: not sure. I'm trying a few other things first; thanks for your comments...
After playing with the `multiprocessing` module for a few days, it turned out I missed something obvious from the above profiler log:
Time elapsed: 73.18 seconds
Wrote profile results to WCON demo.py.lprof
ubuntu@ip-172-31-9-94:~/github/open-worm-analysis-toolbox/examples$ python -m line_profiler WCON\ demo.py.lprof
Total time: 0.74199 s
...
The function I was profiling actually took just 0.74 seconds out of the 73.18 seconds the whole process was taking. I was optimizing the wrong method.
As it turns out, after further profiling, the culprit, taking 99% of the time, was a call to the `jsonschema.validate` method. I'm going to create a replicable example and post it to the `jsonschema` repo. Validating against the schema is really an optional thing; if we are sure a file conforms, then it is an unnecessary step.
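One thing that may be worth trying in the meantime (a hedged sketch; `wcon_schema` stands for whatever schema dict the package already loads): `jsonschema.validate` re-checks the schema and builds a fresh validator on every call, so constructing a `Draft4Validator` once and reusing it avoids that repeated setup, though it won't help if the schema itself is simply expensive to apply:

```python
from jsonschema import Draft4Validator

# wcon_schema: the already-loaded WCON schema dict (name assumed here)
Draft4Validator.check_schema(wcon_schema)      # validate the schema once
_WCON_VALIDATOR = Draft4Validator(wcon_schema)

def validate_wcon(instance):
    """Validate a parsed WCON document against the cached validator."""
    _WCON_VALIDATOR.validate(instance)
```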
I sped up the saving code again (commit cf8051d6fa2db06e921fa8fe3a8bbe9b9078bb56). It turns out the line

keys_used = [k for k in df_segment.columns.get_level_values('key')
             if k != 'aspect_size']

... was not returning unique results, so while `keys_used = ['x', 'y']` was what I wanted to be the typical result of that line, it was actually ending up with something like `keys_used = ['x', 'x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'y']`, so the later filtering step was being looped over more times than necessary.
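One way to express that fix is to deduplicate the keys while preserving their order; the actual change in commit cf8051d may be written differently:

```python
from collections import OrderedDict

keys_used = [k for k in df_segment.columns.get_level_values('key')
             if k != 'aspect_size']
# Keep only the first occurrence of each key, preserving order, so e.g.
# ['x', 'x', ..., 'y', 'y', ...] collapses to ['x', 'y'].
keys_used = list(OrderedDict.fromkeys(keys_used))
```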
Now saving the full example file takes 5.38 seconds (an improvement on the previous 38.53 seconds, and now a 324x speedup from the original 29 minutes).
That's 5.38 seconds for the full 75 MB? That's reasonable! Good work getting it back into the realm of usability!
Python performance summary
File used for test: tracker-commons/tests/from_schafer_lab.wcon
Size: 59.3 MB (4641 frames, ~250 x points and ~250 y points per frame)
Zipped size: 5.57 MB
| Instruction | Time (seconds) | jsonschema.validate used? | Cores used |
|---|---|---|---|
| WCONWorms.load_from_file | 1.00 | No | 1 |
| WCONWorms.load_from_file | 74.29 | Yes | 1 |
| WCONWorms.save_to_file | 5.38 | n/a | 1 |
Information on computer used:
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Stepping: 4
CPU MHz: 2800.062
BogoMIPS: 5600.12
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-3
$ uname -a
Linux ip-172-31-9-94 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ python
>>> import sys
>>> sys.version
'3.5.1 |Continuum Analytics, Inc.| (default, Dec 7 2015, 11:16:01) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]'
Benchmark script:

import time

from wcon import WCONWorms

JSON_path = '~/github/tracker-commons/tests/from_schaefer_lab'

start_time = time.monotonic()
w1 = WCONWorms.load_from_file(JSON_path,
                              validate_against_schema=False)
print("Time to load w1: " + str(time.monotonic() - start_time))

test_path = 'test.wcon'
pretty = False  # whether to pretty-print the saved JSON

start_time = time.monotonic()
w1.save_to_file(test_path, pretty_print=pretty)
print("Time to save w1: " + str(time.monotonic() - start_time))
Yikes. Can we turn validation off by default, or will that result in badly formed WCON files turning into badly formed data in memory without any good way to tell?
I agree. Validation can be turned off, although I have the default set to True, if that's okay. I'm not sure what happens if we pass invalid WCON. It still has to be valid JSON; that much is still checked.
Can you maybe check what happens with valid JSON that is invalid WCON before you close this?
I'm really rather wary of providing something that takes 10x longer than it needs to, or of providing something that doesn't actually check that the WCON is valid. Some sort of very clear warning is needed in either case. This tends to be the kind of thing that is not obvious to a casual user and the first experience shouldn't be, "Well, I did all this work to switch to WCON and now it takes me 15x longer than with my format...no thanks!"
I still haven't checked what happens with some invalid WCON files, but I did create this issue asking the `jsonschema` authors why `validate` is so slow: https://github.com/Julian/jsonschema/issues/277
We can leave this issue open until I can speed up the validation step or I think more carefully about in what situations not validating might cause an issue.
@Ichoran would you mind please adding the XJ30_NaCl500mM4uL6h_10m45x10s40s_Ea.wcon file you mentioned in your opening comment on this issue to the shared Google Drive data folder? That way it can be pulled down for automatic Travis CI testing. Thanks.
It's already on the drive. Feel free to move it around as makes sense for you.
Great, thanks. I've now moved it from "example_data/MWT Example Data" to "example_data/WCON", just because that's where the other files are.
In incidental testing with files based upon XJ30_NaCl500mM4uL6h_10m45x10s40s_Ea.wcon, a 75 MB wcon file, the Scala and Julia implementations complete in a few seconds, while the Python implementation completes in a few minutes (~60-90x slower). This poses barriers for the analysis of large data sets. If there are easy ways to speed up this performance, it is probably advisable. If not, it may be worth investigating an alternate implementation for large data sets that wraps one of the other implementations (e.g. the yet-to-be-created C or Rust implementations).