Closed thiagokokada closed 4 years ago
Using the script from @bfontaine from PR #64 (just one run because I am lazy):
This branch without indent:
1203203 function calls (1024403 primitive calls) in 0.525 seconds
This branch with indent:
1203203 function calls (1024403 primitive calls) in 0.524 seconds
Master:
1119203 function calls (1005203 primitive calls) in 0.456 seconds
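The summary lines above have the shape of `cProfile` output. A minimal sketch of collecting such a line is shown here; `dump_many` is a hypothetical stand-in workload, not the actual script from PR #64, so the numbers it produces are not comparable to the ones in this thread.

```python
import cProfile
import io
import pstats


def dump_many(n=1000):
    """Hypothetical workload standing in for the real benchmark script."""
    return [repr({"id": i, "tags": ["a", "b"]}) for i in range(n)]


profiler = cProfile.Profile()
profiler.enable()
dump_many()
profiler.disable()

# Render the stats into a string; the summary line has the form
# "N function calls ... in X seconds".
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

A single run, as here, is noisy; repeating the measurement a few times would give a better idea of whether a difference of a few hundredths of a second is real.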
So yeah, there is a slight overhead, but it does not seem that bad. WDYT @swaroopch?
Awesome! 🙌
FYI I generated 100,000 random EDN structures which I dumped with indent=2 and then re-loaded to check nothing was lost in the roundtrip, and I haven’t had any issues like I had with the previous implementation. 👌
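A rough sketch of that kind of roundtrip check follows. It is an illustration only: it uses the standard library `json` module as a stand-in for the EDN dumper and loader, and `random_value` and `check_roundtrip` are hypothetical helpers, not the code that was actually run.

```python
import json
import random


def random_value(rng, depth=0):
    """Build a random nested structure of JSON-compatible values."""
    if depth >= 3 or rng.random() < 0.4:
        scalars = [None, True, False, rng.randint(-1000, 1000),
                   "s%d" % rng.randint(0, 99)]
        return rng.choice(scalars)
    if rng.random() < 0.5:
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    return {"k%d" % i: random_value(rng, depth + 1)
            for i in range(rng.randint(0, 4))}


def check_roundtrip(count, seed=0):
    """Return how many structures did not survive dump followed by load."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(count):
        value = random_value(rng)
        # Stand-in for dumping with indent=2 and loading the result back.
        if json.loads(json.dumps(value, indent=2)) != value:
            failures += 1
    return failures


failures = check_roundtrip(1000)
print(failures)
```

Seeding the generator makes any failing structure reproducible, which helps when reporting a bug against a specific implementation.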
@thiagokokada Will follow up this week.
@thiagokokada This is a really nice implementation! :+1: for writing the tests.
Can we please add docstrings to the functions indent_lines, udump, and dump that describe the parameters? For example, I had to read the PR twice to understand the difference between indent and indent_step :-)
Out of curiosity, do you think indent_lines would be faster by using https://docs.python.org/3/library/io.html#io.StringIO vs. string concatenation?
Thank you!
1) Sure, will do :+1:
2) No, I don't think so. Creating a list of strings and joining them should be really efficient, probably even more so than StringIO (the older implementation used string concatenation, which, yeah, is kind of slow): https://stackoverflow.com/q/4733693/2751730
@thiagokokada
One minor optimization that may help would be to store indent_step * ' ' in a variable instead of re-computing it for every line. It might be worth trying with StringIO/cStringIO just to check.
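To make the two options concrete, here is a minimal sketch of both approaches with the prefix computed once. `indent_join` and `indent_stringio` are hypothetical names for illustration; they are not the `indent_lines` function from this branch.

```python
import io


def indent_join(lines, indent_step):
    """Indent each line using a list plus str.join."""
    prefix = indent_step * " "  # computed once, outside the loop
    return "\n".join([prefix + line for line in lines])


def indent_stringio(lines, indent_step):
    """Produce the same text by writing into an io.StringIO buffer."""
    prefix = indent_step * " "
    buffer = io.StringIO()
    for i, line in enumerate(lines):
        if i:
            buffer.write("\n")  # separator between lines, none after the last
        buffer.write(prefix)
        buffer.write(line)
    return buffer.getvalue()


sample = ["[1", "2", "3]"]
joined = indent_join(sample, 2)
streamed = indent_stringio(sample, 2)
print(joined == streamed)
```

Both functions return identical text, so the choice between them comes down to measured speed and readability. Timing each with `timeit` on realistic input would be the way to compare them.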
I applied the small optimization from @bfontaine anyway and added the docstrings asked for by @swaroopch.
Now about the StringIO. I am not going to run exhaustive tests, however this is what I got with StringIO:
1203203 function calls (1024403 primitive calls) in 0.510 seconds
There really doesn't seem to be much difference. Actually, even using string concatenation (which should be slower) there isn't much difference in performance, at least using the benchmark from @bfontaine.
I think the current code is more idiomatic Python too, and it also avoids an import, so I prefer it as it currently is. WDYT?
Thank you @thiagokokada !
This adds a pretty printer similar to json.dumps(indent=<int>). However, it does not follow Clojure formatting guidelines, instead formatting in a way more familiar to users of other languages like Python.
So it will convert this:
To this:
Instead of this:
This should be already better than the current status quo (that is, no pretty printer at all).
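For readers who have not used it, the `json.dumps(indent=<int>)` layout that the description compares against is sketched below. This is JSON produced by the Python standard library, shown only to illustrate the one-element-per-line style; it is not output from this branch.

```python
import json

data = {"name": "edn", "tags": ["a", "b"]}

# indent=2 puts every element on its own line, nested by two spaces per level.
pretty = json.dumps(data, indent=2)
print(pretty)
```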
Should fix issue https://github.com/swaroopch/edn_format/issues/39 (unless the author of the issue wanted a more Clojure-like pretty printer).
Alternative implementation of PR https://github.com/swaroopch/edn_format/pull/64. It fixes all issues found by @bfontaine, and the approach is also simpler. However, unlike the older approach, this one also changes the non-indent flow, and it may be slower (though I think the difference will be insignificant).