brainvisa / aims-free

Analysis of Images and Signal for neuroimaging

Search and nuke unused code #62

Open ylep opened 3 years ago

sapetnioc commented 2 years ago

I wrote the following script to identify the commands that are used in source files (*.py, *.cc and *.cpp):

import os
import re
import sys

exclude_commands = {'bv', 'brainvisa'}
source_extensions = {'.py', '.cc', '.cpp'}

commands = set(os.listdir('/casa/host/bin')) - exclude_commands

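# Escape the names so that regex metacharacters in a command name
# (such as '.' or '+') are matched literally.
r = re.compile(r'\b(' + '|'.join(re.escape(c) for c in sorted(commands)) + r')\b')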

# In the usage dictionary, map each command found in at least one source file
# to a dictionary that maps the full path of each such file to the line numbers
# where the command appears. This is more than is needed just to list unused
# commands, but it could be useful in another context.
usage = {}
for dirpath, dirnames, filenames in os.walk('/casa/host/src'):
    for f in filenames:
        _, ext = os.path.splitext(f)
        if ext in source_extensions:
            fp = os.path.join(dirpath, f)
            try:
                # Try the default encoding first
                with open(fp) as source:
                    content = source.read()
            except UnicodeDecodeError:
                content = None
            if content is None:
                try:
                    # Fall back to latin-1, which can decode any byte sequence
                    with open(fp, encoding='latin-1') as source:
                        content = source.read()
                except Exception as e:
                    print('ERROR reading ', fp, ':', str(e), file=sys.stderr)
                    continue
            for m in r.finditer(content):
                command = m.group(1)
                line_number = content[:m.start(1)].count('\n') + 1
                usage.setdefault(command, {}).setdefault(fp, []).append(line_number)

# Print unused commands
unused = sorted(commands.difference(usage))
for c in unused:
    print(c)
print()
print(len(unused), 'commands unused in sources')

sapetnioc commented 2 years ago

It gives 315 unused commands, but we can discuss the source selection. Do you think these three extensions are enough?

denisri commented 2 years ago

Some commands might be used in other command scripts (with no extension), but I don't think this happens. Now, a bunch of commands are not used in any script/process/pipeline but are still used manually by users, so that does not mean that we can safely throw away all those 315 commands. But at least this can help us to mark those which are actually used in scripts, and discuss the other ones.
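To check the point about command scripts with no extension, here is a minimal sketch of how such files could be recognised, assuming they start with a shebang line (`#!`). The helper name `is_extensionless_script` is hypothetical and is not part of the script above.

```python
import os


def is_extensionless_script(path):
    """Return True if path has no extension and starts with a shebang (#!)."""
    _, ext = os.path.splitext(os.path.basename(path))
    if ext:
        return False
    try:
        # Read only the first two bytes, in binary mode, to avoid decoding issues
        with open(path, 'rb') as f:
            return f.read(2) == b'#!'
    except OSError:
        # Unreadable files are simply not treated as scripts
        return False
```

In the scanning loop, the extension test could then become something like `ext in source_extensions or is_extensionless_script(fp)`, at the cost of opening every extensionless file once.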

sapetnioc commented 2 years ago

Here is the list of unused commands.

Commands to keep

Commands to deprecate/remove

Commands to review

sapetnioc commented 2 years ago

We can move the commands from one list to another (drag and drop). A checked box means that the choice is certain and final. Unchecked commands indicate a decision that still has to be made.

ylep commented 2 years ago

I just added a category for unreviewed commands.

The test commands (syntax_test, counter_test, etc.) really pollute the bin directory. They have to be in the installed package because we need them for, well... tests... but maybe they could be moved to another directory?
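As a rough sketch of how the test commands could be separated from the rest before deciding where to move them: the `*_test` pattern below is only an assumption inferred from the two names given above, and `split_test_commands` and `some_command` are hypothetical names used for illustration.

```python
import fnmatch


def split_test_commands(commands, pattern='*_test'):
    """Split command names into (test_commands, other_commands), both sorted."""
    tests = sorted(c for c in commands if fnmatch.fnmatchcase(c, pattern))
    others = sorted(c for c in commands if not fnmatch.fnmatchcase(c, pattern))
    return tests, others


# 'some_command' is a made-up name standing in for a regular command
tests, others = split_test_commands({'syntax_test', 'counter_test', 'some_command'})
print(tests)   # ['counter_test', 'syntax_test']
print(others)  # ['some_command']
```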

sapetnioc commented 2 years ago

Well, I was trying to move a command at the same time. Your modification is gone, sorry. The list is too long to use drag and drop to move commands.

denisri commented 2 years ago

Right: drag & drop is definitely not convenient here: the list doesn't scroll during drag, and items from the end of the list have to be moved about 10 times to reach their destination, with a lag of 2-3 seconds after each move. We have to manage the list another way.

sapetnioc commented 2 years ago

We can edit the comment and move the commands freely (adding an "x" if necessary to check the checkbox, but with 3 categories the checkboxes seem to be useless). We just have to be careful that two people do not edit at the same time.
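For editing the comment as text, a possible sketch of generating the lists in GitHub task-list syntax (`- [ ]` for undecided, `- [x]` for decided) is shown below. The function `render_checklist`, the `###` heading style and the example placement of commands in categories are all assumptions for illustration, not the actual state of the lists.

```python
def render_checklist(categories):
    """Render {title: [(command, decided), ...]} as Markdown task lists."""
    lines = []
    for title, entries in categories.items():
        lines.append(f'### {title}')
        for command, decided in sorted(entries):
            # An "x" between the brackets checks the box
            mark = 'x' if decided else ' '
            lines.append(f'- [{mark}] {command}')
        lines.append('')
    return '\n'.join(lines)


# Hypothetical placement of two commands, for illustration only
example = {
    'Commands to keep': [('syntax_test', True)],
    'Commands to deprecate/remove': [],
    'Commands to review': [('counter_test', False)],
}
print(render_checklist(example))
```

The generated text can be pasted into the comment, and moving a command then means cutting one line and pasting it under another heading.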