Profiling ASMOSES - Githubissues

Bitseat commented 4 years ago

Introduction As it is explained in the git issue #69, asmoses is about 1.35x slower when atomspace port is enabled without subprogram memoization, and 1.15x slower with subprogram memoization. The goal of this profiling is to find where the ported program spends too much of its time and improve those places to have asmoses run at least as fast as moses when atomspace-port is enabled, and faster when subprogram memoization is enabled. First, a summary of profiling results with and without porting atomspace is explained then places that need improvement are suggested.

Result As an example, by running these two commands,valgrind --tool=callgrind asmoses -Hmux -k3 -m2000 -p0.0001 --atomspace-port=1 and valgrind --tool=callgrind asmoses -Hmux -k3 -m2000 -p0.0001, we can view and analyze the results (callgrind.out.4138 and callgrind.out.5406) on kcachegrind. As an alternative, you can view snips of the graphs here and here.

valgrind --tool=callgrind asmoses -Hmux -k3 -m2000 -p0.0001 --atomspace-port=1

opencog::moses::atomese_based_scorer::operator()(std::vector) - 92.36%
opencog::moses::logical_bscore::operator()(opencog::Hadle...) - 91.15%
opencog::combo::complete_truth_table::populate(opencog::Handle co..) - 91.18%
    opencog::atomese::Interpreter::operator()(opencog::Handle cons...) - 16.22%
        opencog::atomese::Interpreter::execute(unsigned short, std::vect...)- 16.21%
    opencog::combo::complete_truth_table::populate_features(std::vecto...)- 51.07%
        opencog::Handle opencog::createLink - 46.23%
    opencog::combo::complete_truth_table::setup_features(opencog::Hand..) - 13.38%
        opencog::Atom::setValue(opencog::Handle const...) - 10.31%

valgrind --tool=callgrind asmoses -Hmux -k3 -m2000 -p0.0001

opencog::moses::combo_based_scorer::operator()(std::vector... ) - 55.51%
    opencog::moses::representation::get_candidate(std::vect...) - 14.68%
    opencog::moses::logical_bscore::operator()(opencog::tree...)- 40.40%
        opencog::combo::complete_truth_table::complete_truth_table - 38.83%
           opencog::combo::boolean_interpreter::operator()(opencog::tr...) - 31.58%

As the result where atomspace-port is enabled depicts, too much of the CPU processing time is taken by complete_truth_table::populate. It clearly shows the functions Interpreter (16.22%), populate_features (51.07%), and setup_features (13.38%) taking the vast majority of the time of our program, and we can see that populate_features has most of its time spent by the createLink function.

Plan Removing the repetitive call to the createLink is expected to improve the performance greatly.

Note: related profiling results can be found here and one table based problem result here

ngeiswei commented 4 years ago

Thanks for including the dot.ps files, @Bitseat, however could you enable "relative" when generating them, as the absolute CPU allocation numbers are difficult to interpret?

ngeiswei commented 4 years ago

BTW, I think you can include the dot files instead, no need to convert to ps, hopefully I can visualize them.

Bitseat commented 4 years ago

@ngeiswei, We can export the graph as DOT file or as an image. Exporting it as DOT file can only show us the absolute costs even if "relative" is enabled (look ported and Unported). So I think it is better exporting it as an image. look here and here. A few more images are also added into the other profiling test folders.

singnet / asmoses

Profiling ASMOSES #108