nedbat / cog

Small bits of Python computation for static files
MIT License
340 stars 26 forks source link

Optimize previous cog output handling #30

Closed gavriil-deshaw closed 6 months ago

gavriil-deshaw commented 6 months ago

~Using StringIO for input and output files in all cases~

Storing the previous code present in the file as a list instead of a string

Fixes #29

gavriil-deshaw commented 6 months ago

FWIW, all tests pass:

$ make test
tox -q
py37: SKIP ⚠ in 0 seconds
...........................................................................................................................................                                                                                   [100%]
139 passed in 0.36s
py38: OK ✔ in 1.53 seconds
py39: SKIP ⚠ in 0 seconds
...........................................................................................................................................                                                                                   [100%]
139 passed in 0.37s
py310: OK ✔ in 1.51 seconds
...........................................................................................................................................                                                                                   [100%]
139 passed in 0.36s
py311: OK ✔ in 1.47 seconds
py312: SKIP ⚠ in 0.12 seconds
Name                        Stmts   Miss Branch BrPart   Cover   Missing
------------------------------------------------------------------------
cogapp/__init__.py              1      0      0      0 100.00%
cogapp/__main__.py              3      3      0      0   0.00%   3-7
cogapp/cogapp.py              487      0    212      0 100.00%
cogapp/makefiles.py            22      0     16      0 100.00%
cogapp/test_cogapp.py         854      3     88      3  99.36%   1300->1302, 2102, 2240-2241
cogapp/test_makefiles.py       68      0     10      0 100.00%
cogapp/test_whiteutils.py      68      0      0      0 100.00%
cogapp/utils.py                37      0      8      0 100.00%
cogapp/whiteutils.py           43      0     34      0 100.00%
------------------------------------------------------------------------
TOTAL                        1583      6    368      3  99.54%
  py37: SKIP (0.00 seconds)
  py38: OK (1.53 seconds)
  py39: SKIP (0.00 seconds)
  py310: OK (1.51 seconds)
  py311: OK (1.47 seconds)
  py312: SKIP (0.12 seconds)
  coverage: OK (0.40 seconds)
  congratulations :) (5.50 seconds)
nedbat commented 6 months ago

Thanks for the pull request! I've never used on cog on such large files so haven't suffered the performance problem.

Looking at your changes, I'm not sure we need to deal with the input and output files differently. I think the speed problems are all due to how previous is managed (string concatenation instead of list appending). Would you mind trying just that change and see if the speed is still good?

gavriil-deshaw commented 6 months ago

Hey Ned, thanks for the quick reply! You're right - the speed is the same only with the change to previous! I'll push an updated commit shortly.

Thanks, Panayiotis

P.S.: Something weird is going on and I'm not sure why... Here are some results for cog -xc: Normal cog:

real    0m34.789s
user    0m14.104s
sys     0m20.461s

Cog with io.StringIO() as fIn:

real    0m9.625s
user    0m9.039s
sys     0m0.524s

Cog with io.StringIO() as fIn and previous as list:

real    0m0.172s
user    0m0.123s
sys     0m0.021s

Cog with previous as list:

real    0m0.136s
user    0m0.108s
sys     0m0.021s

I ran each one multiple times and got roughly the same results. It is not clear to me why if previous is a string then the handling of fIn matters, but if previous is a list then it doesn't. Just out of curiosity, can you think of any reasons?

nedbat commented 6 months ago

Hmm, that is odd, I can't think why a stringIO for reading would have an effect at all, but I like the simplicity of the latest code, and it gets us the full benefit!

gavriil-deshaw commented 6 months ago

Exactly! Thanks for merging :))