Closed by danlivingston 6 months ago
Doesn't the hop at n+1 depend on the hop at n? Multiprocessing may be helpful to check k runs at a time at each hop instead of 1 by 1.
I've looked through the code and had this idea, which should theoretically allow multiprocessing of trade runs without rewriting the code too much. I'm not a Python guy, but I can try implementing this. @eyeonus, may I run the idea by you first, just in case it has some glaring holes I didn't notice?
Idea: Split the processing on the second hop (maybe the third etc. in the future, but baby steps). Among all possible second-hop destinations, each worker only processes a predetermined subset (simply skipping the processing for any systems not allocated to it).
Conditions:

- The database is only read during a run, never written to, so the workers don't need to coordinate any writes between them.

Implementation outline:

- Add `this-worker-number` and `worker-count` parameters to the `run` subcommand, for use by the next item on this list.
- When generating the second-hop destinations (`tradedb.TradeDB.getDestinations()` seems to be the place to do this), only select those assigned to the current worker. The `getDestinations()` result seems to be consistent given the same parameters, so each worker can calculate it independently, and for N workers we just split the list into N chunks, with the C'th worker getting the C'th chunk (the last chunk will be a bit bigger than the others, but no big deal; a minimal sketch of the chunking follows below). Unfortunately, unless there's some cool Python feature I don't know about, this means `getDestinations()` would have to pre-calculate the whole list, at least for the second hop.

Final notes:

- This could probably be done more elegantly than by slicing inside `getDestinations()`, but I wanted it to work while passing the least amount of data between the orchestrator and the workers and requiring the fewest modifications to the existing algorithms. The other thing is that the extra work only happens on the second hop; all the further hops work off the reduced set of second hops and as such shouldn't be duplicating anything.
- One caveat is `--start-jumps`, since each worker will also have to independently calculate the first hop (which is trivial when you have only a few origins), but this can probably be adapted to make the "splitter hop" selectable.

What do you people think?
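A minimal sketch of the chunking idea, assuming hypothetical `worker_number`/`worker_count` values and a plain list standing in for a pre-calculated `getDestinations()` result (none of this is existing TD code):

```python
# Hypothetical sketch of splitting second-hop destinations between workers.
# `destinations` stands in for the full result of tradedb.TradeDB.getDestinations(),
# which would need to be pre-calculated before slicing, as noted above.

def worker_slice(destinations, worker_number, worker_count):
    """Return the chunk of destinations assigned to one worker.

    Workers 0 .. worker_count-1 each get a contiguous chunk; the last
    worker absorbs the remainder, so its chunk is slightly bigger.
    """
    chunk = len(destinations) // worker_count
    start = worker_number * chunk
    end = len(destinations) if worker_number == worker_count - 1 else start + chunk
    return destinations[start:end]

# Example: 3 workers splitting 10 destinations -> chunk sizes 3, 3, 4.
dests = list(range(10))
print([len(worker_slice(dests, w, 3)) for w in range(3)])  # [3, 3, 4]
```

With N workers launched with the same parameters, each computes the same full list and keeps only its own slice, so no destination data has to be passed between the orchestrator and the workers.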
The code that does the trade runs is a recursive algorithm, so splitting at the top level isn't theoretically hard. The problem you may run into is that it almost always splits into more than two at the top level. Keep in mind that a program can only run as many processes in parallel as the computer has processor cores, so there is a hardware limit to the number of processes a multi-process algorithm can usefully utilize.
The algorithm splits for every system within range of the origin that passes all the requirements for consideration, which can range from zero out past the bubble to very high numbers in densely populated areas.
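For illustration, a hedged sketch of what fanning the top-level split out over a process pool might look like; `evaluate_branch` and the branch list are placeholders, not TD's actual recursion, and the pool size is capped at the core count mentioned above:

```python
# Hypothetical sketch: fan the top-level split out over a pool of workers.
# evaluate_branch() stands in for "recurse into one in-range system"; extra
# branches simply queue up behind the processes, which are capped at the
# number of CPU cores.
from multiprocessing import Pool, cpu_count

def evaluate_branch(system_id):
    # Placeholder for the recursive route search rooted at one system.
    return system_id, system_id * 2  # pretend "profit"

if __name__ == "__main__":
    branches = list(range(40))  # e.g. 40 systems in range of the origin
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(evaluate_branch, branches)
    print(max(results, key=lambda r: r[1]))  # best branch by pretend profit
```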
The database is only ever written to when using the import or update sub commands. Calculating a run only needs to read the database, so that condition is kind of a given.
Feel free to work on this if you want. It's open source for a reason, so go ahead and make yourself a fork, and once you have a working build, send me a PR.
So, I just added #79 which could help with that. Docker is great for splitting things off into workers if that's needed.
Turning TD into a self-contained VM would not help at all with this.
I don't code in Python, but I know of two or three multi-threaded Monte Carlo simulation implementations for Python out there. Plugging them in should not be a hard task, albeit not a trivial one either, for someone who does code in Python.
General Monte Carlo simulations can be done in parallel since the current evaluations do not depend on prior values. TD, however, is creating a path, and the “value” (profit) at the end node depends on all of the previously visited stations, so it's a serial calculation instead of a parallel one. One may be able to calculate all child nodes from the current node in parallel, perhaps, but not all paths from the origin.
Avi
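A rough illustration of that distinction, with placeholder `score_child()` and neighbour data rather than anything from TD: the path itself is extended one hop at a time (serial), while the candidates for the next hop are scored in parallel.

```python
# Hypothetical sketch: the path is built serially, but the candidate children
# of the current node are scored in parallel. score_child() and NEIGHBOURS are
# placeholders, not TD code.
from concurrent.futures import ProcessPoolExecutor

NEIGHBOURS = {0: [1, 2, 3], 1: [4, 5], 2: [5, 6], 3: [6], 4: [], 5: [], 6: []}

def score_child(args):
    path, child = args
    # Placeholder profit: it depends on the whole path so far, which is why
    # the overall route has to be extended serially.
    return child, sum(path) + child

def best_path(origin, hops):
    path = [origin]
    with ProcessPoolExecutor() as pool:
        for _ in range(hops):
            children = NEIGHBOURS.get(path[-1], [])
            if not children:
                break
            scored = pool.map(score_child, [(path, c) for c in children])
            best, _ = max(scored, key=lambda s: s[1])
            path.append(best)
    return path

if __name__ == "__main__":
    print(best_path(0, hops=2))  # -> [0, 3, 6] with the placeholder data
```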
Even if it can be difficult, mathematically speaking every recursive function has an iterative version. In most common cases, it has to deal with a queue of "things" (FIFO or LIFO, depending on the recursive function) that gets filled along the way.
Still speaking in terms of computing theory, this queue is to be consumed by N processes (the number of processors, twice the number of processors, etc.), which in turn put their results in the queue. If I understand correctly, the multiple paths are going to be constructed in waves, one hop for each path at a time.
The mechanics are very common; this is "just" traversing a graph, level after level.
Another approach: given that you already have the algorithm to construct one or more paths (limited by jumps/hops; not yet computing traded goods at this point, only valid paths), we could also feed these multiple paths into the queue, each one being given to a consumer process that evaluates "the trading benefits" from start to end, associates the benefit with the path, discards the path if it doesn't meet a threshold, and stores the path+benefit somewhere (an array? whatever). Multiple paths are thus evaluated in parallel. When the queue is empty, the master process sorts the array by benefit and displays the first N to the user. (Just a theoretical example.)
The "one or more feeders / queue / one or more consumers" setup is a very robust pattern in multiprocessing programming, and in Python all the semantics and tools have been in the standard library for longer than I can remember (I've been coding Python since 2006).
I once coded a web crawler/analyzer with multiple consumers on multiple computers and configurable patterns in the consumers. I will take a look at TD as soon as I find the time (I'm not in the computer business any more, but growing an educational farm…). Not this week, because there are a lot of kids until the end of the holidays.
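A minimal sketch of the feeder/queue/consumers pattern described above, using only the standard `multiprocessing` module; `evaluate()` and the candidate paths are placeholders for a real profit calculation:

```python
# Hypothetical sketch of the feeder -> queue -> consumers pattern described
# above. evaluate() and the candidate paths are placeholders; a real version
# would price each path's trades from start to end.
from multiprocessing import Process, Queue, cpu_count

SENTINEL = None

def evaluate(path):
    return sum(path)  # placeholder "benefit" of a path

def consumer(tasks, results, threshold):
    while True:
        path = tasks.get()
        if path is SENTINEL:
            results.put(SENTINEL)      # tell the master this worker is done
            break
        benefit = evaluate(path)
        if benefit >= threshold:       # discard paths below the threshold
            results.put((benefit, path))

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    n_workers = cpu_count()
    workers = [Process(target=consumer, args=(tasks, results, 5))
               for _ in range(n_workers)]
    for w in workers:
        w.start()

    # Feeder: pre-built candidate paths (placeholders for real hop sequences).
    for path in ([1, 2], [3, 4], [0, 1], [5, 5]):
        tasks.put(path)
    for _ in workers:                  # one sentinel per consumer
        tasks.put(SENTINEL)

    # Master: drain results until every worker has reported done, then sort.
    ranked, done = [], 0
    while done < n_workers:
        item = results.get()
        if item is SENTINEL:
            done += 1
        else:
            ranked.append(item)
    for w in workers:
        w.join()
    print(sorted(ranked, reverse=True))  # best benefit first
```

The feeder, the consumers, and the final sort mirror the three stages described in the comment above; only the evaluation function would need to be swapped for a real trading-benefit calculation.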
I have been privately working on multithreading (not clean enough for a PR yet) and have made significant progress (I believe the multithreading is about 90% complete), but I have hit one major hurdle: sqlite does not support the multithreaded/multiprocess access that is required to complete the innermost getBestHop loop.
I'm a bit stuck on this, so any advice would be greatly appreciated. Where I'm trying to head right now is proxying the requests through a custom multiprocessing Manager object or queue so that the main process can respond.
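One possible shape for that proxying idea, sketched here with plain queues rather than a Manager (the worker/proxy names are hypothetical, and an in-memory database stands in for the real file): workers never open sqlite themselves, they send SQL to the main process and read the rows back from a per-worker response queue.

```python
# Hypothetical sketch of proxying database access back to the main process.
# Workers never open sqlite themselves: they send SQL through a shared request
# queue and read the rows back from their own response queue.
import sqlite3
from multiprocessing import Process, Queue

def worker(worker_id, requests, responses):
    requests.put((worker_id, "SELECT 2 + 2"))  # ask the main process to run a query
    print("worker", worker_id, "got", responses.get())
    requests.put((worker_id, None))            # signal that this worker is finished

def db_proxy(requests, response_queues, n_workers):
    conn = sqlite3.connect(":memory:")         # stand-in for the real database file
    finished = 0
    while finished < n_workers:
        worker_id, sql = requests.get()
        if sql is None:
            finished += 1
            continue
        response_queues[worker_id].put(conn.execute(sql).fetchall())
    conn.close()

if __name__ == "__main__":
    requests = Queue()
    response_queues = [Queue() for _ in range(2)]
    procs = [Process(target=worker, args=(i, requests, response_queues[i]))
             for i in range(2)]
    for p in procs:
        p.start()
    db_proxy(requests, response_queues, n_workers=2)  # main process serves the queries
    for p in procs:
        p.join()
```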
@bahaa32 Two general approaches:

1. As you said, run every database query through a single object that passes the queries on to the database one at a time, creating a choke point and probably killing a lot of the benefit of the entire idea.
2. The memory-expensive approach: throw sqlite out the window for the actual route search. Before you split into threads, just `SELECT *` all the relevant tables into collections in memory and search those instead. Memory expensive, but gaming PCs tend to have some to spare.
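A hedged sketch of approach 2: load the needed tables into plain Python structures once, before any workers exist, so the search itself never touches sqlite from a child process. The table names, column layout, and database path here are placeholders, not TD's actual schema.

```python
# Hypothetical sketch of approach 2: pull the relevant tables into memory once,
# then hand them to the workers so the route search never queries sqlite from a
# child process. "Station", "StationItem", and the .db path are placeholders.
import sqlite3
from multiprocessing import Pool

def load_tables(db_path):
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    stations = [dict(r) for r in conn.execute("SELECT * FROM Station")]
    prices = [dict(r) for r in conn.execute("SELECT * FROM StationItem")]
    conn.close()
    return stations, prices

TABLES = None  # each worker gets its own copy via the pool initializer

def init_worker(tables):
    global TABLES
    TABLES = tables

def search_from(origin_id):
    stations, prices = TABLES
    # ...route search using only the in-memory collections...
    return origin_id, len(prices)  # placeholder result

if __name__ == "__main__":
    tables = load_tables("TradeDangerous.db")  # path is a placeholder
    with Pool(initializer=init_worker, initargs=(tables,)) as pool:
        results = pool.map(search_from, range(8))  # 8 placeholder origin ids
    print(results)
```

Under spawn-based multiprocessing each worker receives its own pickled copy of the tables, which is exactly the memory cost being traded for parallelism.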
Going nowhere and slowly at that. Making much more progress on optimizations of what we have.
Enable multiprocessing so calculations run faster on multi-core CPUs. I thought of this when searching for trade routes with a 500 LY jump range to see if FC trading is profitable.
Implementing this in Python is an absolute PITA, so it's understandable if this is marked `wontfix` or delayed indefinitely.