tahoe-lafs / zfec

zfec -- an efficient, portable erasure coding tool

Benchmarks do not work with Python 3.x #81

Closed sajith closed 8 months ago

sajith commented 1 year ago
$ python --version
Python 3.9.2
$ python bench/bench_zfec.py
  File "/tmp/zfec/bench/bench_zfec.py", line 85
    print "measuring encoding of data with K=%d, M=%d, reporting results in nanoseconds per byte after encoding %d bytes %d times in a row..." % (k, m, SIZE, MAXREPS)
          ^
SyntaxError: invalid syntax
exarkun commented 1 year ago

I have some stuff sitting in my working copy related to this. I'll push it.

sajith commented 1 year ago

(Leaving notes for future reference, although @exarkun already has a handle on this.)

Did a quick check to see whether we could use 2to3:

$ 2to3 -w bench/bench_zfec.py 
RefactoringTool: Skipping optional fixer: buffer
RefactoringTool: Skipping optional fixer: idioms
RefactoringTool: Skipping optional fixer: set_literal
RefactoringTool: Skipping optional fixer: ws_comma
RefactoringTool: Refactored bench/bench_zfec.py
--- bench/bench_zfec.py (original)
+++ bench/bench_zfec.py (refactored)
@@ -82,21 +82,21 @@
     # for f in [_encode_file,]:
     # for f in [_encode_file_not_really, _encode_file_not_really_and_hash, _encode_file, _encode_file_and_hash,]:
     # for f in [_encode_data_not_really, _encode_data_easyfec, _encode_data_fec,]:
-    print "measuring encoding of data with K=%d, M=%d, reporting results in nanoseconds per byte after encoding %d bytes %d times in a row..." % (k, m, SIZE, MAXREPS)
+    print("measuring encoding of data with K=%d, M=%d, reporting results in nanoseconds per byte after encoding %d bytes %d times in a row..." % (k, m, SIZE, MAXREPS))
     # for f in [_encode_data_fec, _encode_data_not_really]:
     for f in [_encode_data_fec]:
         def _init_func(size):
             return _make_new_rand_data(size, k, m)
         for BSIZE in [SIZE]:
             results = benchutil.rep_bench(f, n=BSIZE, initfunc=_init_func, MAXREPS=MAXREPS, MAXTIME=None, UNITS_PER_SECOND=1000000000)
-            print "and now represented in MB/s..."
-            print
+            print("and now represented in MB/s...")
+            print()
             best = results['best']
             mean = results['mean']
             worst = results['worst']
-            print "best:  % 4.3f MB/sec" % (10**3 / best)
-            print "mean:  % 4.3f MB/sec" % (10**3 / mean)
-            print "worst: % 4.3f MB/sec" % (10**3 / worst)
+            print("best:  % 4.3f MB/sec" % (10**3 / best))
+            print("mean:  % 4.3f MB/sec" % (10**3 / mean))
+            print("worst: % 4.3f MB/sec" % (10**3 / worst))

 k = K
 m = M
RefactoringTool: Files that were modified:
RefactoringTool: bench/bench_zfec.py
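
For reference, the `10**3 / best` expressions in the diff above convert the benchmark's nanoseconds-per-byte figures into MB/s, taking 1 MB as 10**6 bytes. A minimal sketch of that arithmetic under Python 3 follows; the helper name `ns_per_byte_to_mb_per_sec` and the sample value are made up for illustration and are not part of zfec:

```python
def ns_per_byte_to_mb_per_sec(ns_per_byte):
    """Convert a cost in ns/byte into throughput in MB/s (1 MB = 10**6 bytes)."""
    # bytes per second = 10**9 / ns_per_byte
    # MB per second    = (10**9 / ns_per_byte) / 10**6 = 10**3 / ns_per_byte
    return 10**3 / ns_per_byte


# Hypothetical sample value, not a measured zfec result.
sample_ns_per_byte = 2.0

# Same format string as bench_zfec.py, but with print as a function.
print("best:  % 4.3f MB/sec" % ns_per_byte_to_mb_per_sec(sample_ns_per_byte))
```

In Python 3, `/` is always true division, so the conversion behaves the same whether the timing value is an int or a float.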

Then it turned out that the benchmarks use pyutil, and the pyutil that gets installed (after pip install .[bench]) still imports the Python 2-only thread module:

$ python bench/bench_zfec.py 
Traceback (most recent call last):
  File "/tmp/zfec/bench/bench_zfec.py", line 6, in <module>
    from pyutil import benchutil
  File "/tmp/zfec/venv/lib/python3.9/site-packages/pyutil/benchutil.py", line 96, in <module>
    import thread
ModuleNotFoundError: No module named 'thread'
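
The `thread` module was renamed to `_thread` in Python 3, which is why the import above fails. One common way to support both names is a fallback import, sketched below. This is only an assumption about how such code could be made importable. Whether pyutil's `benchutil.py` needs the module at all, and how it should be fixed, is not something this traceback shows.

```python
# Fallback import: try the Python 2 name first, then the Python 3 name.
try:
    import thread  # Python 2
except ImportError:
    import _thread as thread  # Python 3 renamed the module to _thread

# The low-level API is available under the old name either way.
ident = thread.get_ident()
print("current thread id:", ident)
```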
sajith commented 1 year ago

Also there are these benchmark-looking files in the top-level directory:

I am unsure how to use them: the files have no comments about their usage or purpose, their version history is short (Zooko's initial git commit, which seems to have been imported from darcs, simply says "setup: add some scripts to benchmark zfec with different stride lengths"), and the README mentions only bench/bench_zfec.py in the context of benchmarks.

Of the four files mentioned above, the Python ones are not up to date:

$ python stridetune-bench.py 
  File "/tmp/zfec/stridetune-bench.py", line 32
    print "stride: %d, results: %d (dup %d)" % (stride, result, results[stride])
          ^
SyntaxError: invalid syntax
$ python stridetune-graph.py 
Traceback (most recent call last):
  File "/tmp/zfec/stridetune-graph.py", line 3, in <module>
    from pyx import *
ModuleNotFoundError: No module named 'pyx'
$ pip install pyx  # pyx is not declared a "bench" requirement in setup.py
[...]
Successfully installed pyx-0.16
$ python stridetune-graph.py 
Traceback (most recent call last):
  File "/tmp/zfec/venv/lib/python3.9/site-packages/pyx/graph/data.py", line 321, in __init__
    filename.readlines
AttributeError: 'str' object has no attribute 'readlines'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/zfec/stridetune-graph.py", line 10, in <module>
    g('stridetune.dat')
  File "/tmp/zfec/stridetune-graph.py", line 7, in g
    g.plot([graph.data.file(f, x=1, y=2)])
  File "/tmp/zfec/venv/lib/python3.9/site-packages/pyx/graph/data.py", line 326, in __init__
    with open(filename) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'stridetune.dat'
itamarst commented 8 months ago

I will try to fix this.