
[Feature request] Pipe selections to shell command in parallel #2634

Open JJK96 opened 5 years ago

JJK96 commented 5 years ago

Currently, when piping multiple selections through or to a shell command, the commands seem to run in series, one after the other. When performing an operation on many selections, performance could be greatly improved by running them in parallel, so that all selections are processed at the same time.

lenormf commented 5 years ago

I think the current stance on parallelism is that it should not be used unless there is absolutely no other option, since it comes with caveats of its own (e.g. synchronization) that would complicate the codebase.

If you need to modify the current selection asynchronously, you can use the { printf %s … | kak -p … } </dev/null >/dev/null 2>&1 idiom; that's what most plugins that parse the buffer do (e.g. spell).

mawww commented 5 years ago

That is different though: we are already kind-of parallel when we run a shell, since Kakoune runs in parallel to the shell script (although it does not do much in the meantime). What is proposed here is to run all the shell scripts, one per selection, at the same time. This does not really require multithreading, just forking all of those processes and then waiting for every one of the shells to exit.

Provided this does indeed improve performance and is not too complex to implement, it could be nice; however, I doubt it would really improve performance.
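
To illustrate the fork-all-then-wait idea, here is a minimal standalone sketch (not Kakoune code; the shell command and selection strings are invented stand-ins, and the stdin/stdout plumbing a real pipe would need is omitted, see the design sketched at the end of the thread):

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>
#include <vector>

int main()
{
    // stand-ins for the text of each selection
    std::vector<std::string> selections = {"foo", "bar", "baz"};
    std::vector<pid_t> pids;

    for (const auto& sel : selections)
    {
        pid_t pid = fork();
        if (pid < 0)
            break; // fork failed
        if (pid == 0)
        {
            // child: one shell per selection; the selection is passed as $1
            // here, whereas a real pipe would feed it to the child's stdin
            execlp("sh", "sh", "-c", "printf 'processed: %s\\n' \"$1\"",
                   "sh", sel.c_str(), (char*)nullptr);
            _exit(127); // only reached if exec failed
        }
        pids.push_back(pid);
    }

    // parent: every shell is now running concurrently;
    // wait for all of them to exit
    for (pid_t pid : pids)
        waitpid(pid, nullptr, 0);
}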

JJK96 commented 5 years ago

The amount of performance gained probably depends on the complexity of the task being executed, but in principle it should be better to do it in parallel, so that all those programs are running at the same time.

occivink commented 5 years ago

Could this also be done when parsing multiple %sh blocks that belong to the same level (same command or same string)? e.g.

echo "%sh{ printf abc } %sh{ printf def }"
echo %sh{ printf abc } %sh{ printf def }
JJK96 commented 5 years ago

I've made an initial implementation of asynchronous piping, but I get a lot of exceptions from it. I am not very well-versed in C++, so I could use some help.

I currently get a lot of random issues, so some of the functions I use are probably not thread-safe. However, even when disabling a lot of the functionality (as shown below), I still get an error about 50% of the time, though a different one:

uncaught exception (St9bad_alloc):
std::bad_alloc
    // one async task per selection; note that beg, in and cmdline are
    // captured by reference
    handles.push_back(std::async(std::launch::async, [&]() {
            /*String out = ShellManager::instance().eval(cmdline, context, in,
                ShellManager::Flags::WaitForStdout).first;*/
            String out = "test\n"; // shell evaluation disabled for debugging
            /*if (insert_eol)
            {
                in.resize(in.length()-1, 0);
                if (not out.empty() and out.back() == '\n')
                    out.resize(out.length()-1, 0);
            }*/
            return result{beg, in, out};
    }));
}
// wait for every task and collect its result in selection order
for (auto& handle : handles)
{
    auto res = handle.get();

    //apply_diff(buffer, res.beg, res.in, res.out);

    //changes_tracker.update(buffer, timestamp);
}
mawww commented 5 years ago

bad_alloc means an allocation failed, probably due to memory usage becoming too high. But in any case, the implementation of this feature should not require multithreading; it's mostly a matter of forking all the necessary processes and handling their data streams.

I would expect an implementation of this to use, say, a Vector<PipeProcess> with struct PipeProcess { pid_t pid; PipeWriter stdin; PipeReader stderr, stdout; }, with PipeWriter/PipeReader refactored to contain what is currently in Pipe and to own the content strings. Then it's just a matter of letting the event manager dispatch events to those and waiting for all forked processes to finish.
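
As a rough standalone sketch of that design, with poll(2) standing in for Kakoune's event manager; the PipeProcess fields and spawn helper below are illustrative guesses, not the actual PipeWriter/PipeReader API:

#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <csignal>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

struct PipeProcess
{
    pid_t pid;
    int stdin_fd;       // write end of the child's stdin pipe
    int stdout_fd;      // read end of the child's stdout pipe
    std::string in;     // selection text still to be written
    std::string out;    // output accumulated so far
    size_t written;     // how much of `in` has been written
};

// fork one shell running `cmd`, wired to a pair of pipes
static PipeProcess spawn(const char* cmd, std::string input)
{
    int to_child[2], from_child[2];
    if (pipe(to_child) < 0 or pipe(from_child) < 0)
    { perror("pipe"); exit(1); }
    pid_t pid = fork();
    if (pid == 0)
    {
        dup2(to_child[0], 0);
        dup2(from_child[1], 1);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execlp("sh", "sh", "-c", cmd, (char*)nullptr);
        _exit(127);
    }
    close(to_child[0]);
    close(from_child[1]);
    return PipeProcess{pid, to_child[1], from_child[0], std::move(input), {}, 0};
}

int main()
{
    signal(SIGPIPE, SIG_IGN); // a dead child must not kill us on write
    const char* cmd = "tr a-z A-Z"; // stand-in for the user's pipe command
    std::vector<PipeProcess> procs;
    for (const char* sel : {"foo\n", "bar\n", "baz\n"}) // stand-in selections
        procs.push_back(spawn(cmd, sel));

    // event loop: write inputs and read outputs as the children allow;
    // in Kakoune this dispatching would be done by the event manager
    while (true)
    {
        std::vector<pollfd> fds;
        for (auto& p : procs)
        {
            if (p.stdin_fd >= 0)
                fds.push_back({p.stdin_fd, POLLOUT, 0});
            if (p.stdout_fd >= 0)
                fds.push_back({p.stdout_fd, POLLIN, 0});
        }
        if (fds.empty())
            break; // all pipes drained
        poll(fds.data(), fds.size(), -1);
        for (auto& fd : fds)
        {
            for (auto& p : procs)
            {
                if (fd.fd == p.stdin_fd and (fd.revents & (POLLOUT | POLLERR)))
                {
                    ssize_t n = write(p.stdin_fd, p.in.data() + p.written,
                                      p.in.size() - p.written);
                    if (n > 0)
                        p.written += n;
                    if (n < 0 or p.written == p.in.size())
                    {
                        close(p.stdin_fd); // EOF tells the child we are done
                        p.stdin_fd = -1;
                    }
                }
                else if (fd.fd == p.stdout_fd and
                         (fd.revents & (POLLIN | POLLHUP)))
                {
                    char buf[4096];
                    ssize_t n = read(p.stdout_fd, buf, sizeof buf);
                    if (n > 0)
                        p.out.append(buf, n);
                    else
                    {
                        close(p.stdout_fd);
                        p.stdout_fd = -1;
                    }
                }
            }
        }
    }

    for (auto& p : procs)
    {
        waitpid(p.pid, nullptr, 0); // reap; p.out holds the replacement text
        printf("%s", p.out.c_str());
    }
}

Closing each child's stdin once its input has been fully written is what lets the child see end-of-file and terminate; the loop ends when every stream has been drained, and only then are the children reaped and their output applied.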