caesar0301 / treelib

An efficient implementation of tree data structure in python 2/3.
http://treelib.readthedocs.io/en/latest/
Other
801 stars 185 forks source link

Please, remove the last new line in tree.show() #139

Open voidloop opened 4 years ago

voidloop commented 4 years ago

Why is there a new line after the tree output?

In the following example the new line is after Mark:

>>> from treelib import Node, Tree
>>> tree = Tree()
>>> tree.create_node("Harry", "harry")  # root node
Node(tag=Harry, identifier=harry, data=None)
>>> tree.create_node("Jane", "jane", parent="harry")
Node(tag=Jane, identifier=jane, data=None)
>>> tree.create_node("Bill", "bill", parent="harry")
Node(tag=Bill, identifier=bill, data=None)
>>> tree.create_node("Diane", "diane", parent="jane")
Node(tag=Diane, identifier=diane, data=None)
>>> tree.create_node("Mary", "mary", parent="diane")
Node(tag=Mary, identifier=mary, data=None)
>>> tree.create_node("Mark", "mark", parent="jane")
Node(tag=Mark, identifier=mark, data=None)
>>> tree.show()
Harry
├── Bill
└── Jane
    ├── Diane
    │   └── Mary
    └── Mark

>>> 

Could you remove this new line from the output leaving the programmer the possibility to add it?

Thank you.

ben-e-whitney commented 4 years ago

Here is the [body][body] of Tree.show:

        self._reader = ""

        def write(line):
            self._reader += line.decode('utf-8') + "\n"

        try:
            self.__print_backend(nid, level, idhidden, filter,
                                 key, reverse, line_type, data_property, func=write)
        except NodeIDAbsentError:
            print('Tree is empty')

        print(self._reader)

Both write and print add a newline to their arguments, so you get an extra (it would seem). I can make a pull request if this isn't the expected behavior.

[body]: https://github.com/caesar0301/treelib/blob/a78d08a79efe33168122b7668ba6c5e3867c1fd2/treelib/tree.py#L761-L772

leonardbinet commented 4 years ago

I totally agree @voidloop, this shouldn't be printed. In order not to break retro-compatibility if some people rely on this (even though it would be weird IMO), I've added the stdout parameter: https://github.com/caesar0301/treelib/blob/dev/treelib/tree.py#L782 To remove print you'll have to set stdout=False.

This is currently on dev branch, which will be merged in next release https://github.com/caesar0301/treelib/issues/128

ben-e-whitney commented 4 years ago

@leonardbinet This solution seems a little strange to me, since

  1. the output is sent to stdout whether the new parameter is True or False,
  2. the docstring says the function returns None, which is not always the case, and
  3. the return type of the function depends on its parameters (stdout), which seems perhaps inadvisable.

What do you think about something like this instead?

def show(self, nid=None, level=ROOT, idhidden=True, filter=None,
         key=None, reverse=False, line_type='ascii-ex', data_property=None,
         end='\n\n'):
    #Docstring here.
    self._reader = []

    def write(line):
        self._reader.append(line.decode('utf-8'))

    try:
        self.__print_backend(nid, level, idhidden, filter,
                             key, reverse, line_type, data_property, func=write)
    except NodeIDAbsentError:
        print('Tree is empty')

    print('\n'.join(self._reader), end=end)

Advantages:

  1. The function always returns the same type (NoneType).
  2. The name of the keyword argument end matches the one passed to print.
  3. The final string is constructed with one call to str.join rather than many string concatentations.

Expanding on the last point, this little test suggests that str.join is faster. I have not checked that this pattern holds with different line lengths or numbers of lines.

import itertools
import string
import sys
import timeit

newline: str = '\n'
N: int = 100

time = timeit.timeit(
'''
output = ''
for _ in range(N):
    output += string.ascii_letters + newline
print(output, file=sys.stderr)
''',
'''
import string
from __main__ import newline, N
'''
)
print(f'concatenation: {time:.1e} s')

time = timeit.timeit(
'''
lines = tuple(itertools.repeat(string.ascii_letters, N))
print(newline.join(lines), file=sys.stderr)
''',
'''
import itertools
import string
from __main__ import newline, N
'''
)
print(f'`str.join`:    {time:.1e} s')

time = timeit.timeit(
'''
lines = tuple(itertools.repeat(string.ascii_letters, N))
print(*lines, sep=newline, file=sys.stderr)
''',
'''
import itertools
import string
from __main__ import newline, N
'''
)
print(f'`sep`:         {time:.1e} s')

I ran this with

$ python3 printtest.py 2>/dev/null
concatenation: 1.6e+01 s
`str.join`:    3.3e+00 s
`sep`:         2.1e+01 s

to suppress the output of the inner print calls.

Again, I can make a pull request if this would be a good change.

caesar0301 commented 4 years ago

@ben-e-whitney 's solution seems more elegant to be compatible. :+1:

leonardbinet commented 4 years ago

@ben-e-whitney my bad I misread the issue 😅 I'll create a dedicated issue for my issue.

What I meant was that having a print to stoud in a library is a bad practice, see for instance this article

It's also bad practice to ship a package that prints directly to stdout because it removes the user's ability to control the messages.

leonardbinet commented 4 years ago

here is the issue I created: https://github.com/caesar0301/treelib/issues/141

vallsv commented 3 years ago

Hi. Any progress here?

Is there a strong argue against? Cause i don't think it makes sense to have backward compatibility here, and in this case the patch is really really small.

I can create a PR if you like.