ampinzonv / BB2

BioBash UN official repository

Determine the optimal number of default cores #5

Open ampinzonv opened 1 year ago

ampinzonv commented 1 year ago

CONTEXT During installation, BioBASH gets the number of cores in the system and sets the environment variable $BIOBASH_CORES. This variable is then used by os::default_number_of_cores to determine the default number of cores that BioBASH scripts will use.
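For illustration, here is a minimal sketch of how an installer could detect the core count and export it. This is not BioBASH's actual installer code: the function name `detect_cores` is hypothetical, and the fallback order (`nproc`, then `getconf`, then `sysctl`) is an assumption intended to cover Linux and macOS.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT taken from the BioBASH sources.
# Prints the number of online cores, or 1 if it cannot be determined.
detect_cores() {
    local n=""
    if command -v nproc >/dev/null 2>&1; then
        # GNU coreutils (most Linux systems)
        n=$(nproc 2>/dev/null) || n=""
    fi
    if [ -z "$n" ] && command -v getconf >/dev/null 2>&1; then
        # Available on Linux and macOS
        n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
    fi
    if [ -z "$n" ] && command -v sysctl >/dev/null 2>&1; then
        # macOS / BSD fallback
        n=$(sysctl -n hw.ncpu 2>/dev/null) || n=""
    fi
    # Anything empty or non-numeric falls back to a single core
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Same variable name the issue describes; how the real installer
# persists it (profile file, config, etc.) is not shown here.
BIOBASH_CORES=$(detect_cores)
export BIOBASH_CORES
```

Falling back to 1 keeps the behaviour safe on systems where none of the probes is available.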

PROBLEM We have not implemented a consistent method to determine the "optimal" default number of cores for a given system, so BioBASH currently sets the default to 1 core, for the following reason:

By default BioBASH uses 1 core. One attempt was to use a fraction of the total cores in the system (something like 1/3 of the total number of cores). This approach was dismissed because on very large infrastructures it could do more harm than good: we could end up splitting a small process into several processes and making things slower (this has not been tested thoroughly and needs to be done). One could bet on a default of "2" cores, because there is a very low probability that someone is running BioBASH on a single-core CPU (such as the Celeron G470 or an Intel Atom), but no one knows.

SOLUTION Use the BIOBASH_CORES variable to set reasonable limits for the default number of cores. For instance, on a huge system with, let's say, 100 cores, one could choose a default that WON'T MAKE SMALL PROCESSES SLOWER but still lets intermediate to large processes run smoothly. If the system has a low number of cores, these limits should also be assessed. That's the solution we want to find.
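As a starting point for discussion, one possible shape for such a policy is a fraction of the total clamped between a floor and a ceiling. The sketch below is only an assumption: the function name `biobash_default_cores`, the one-quarter fraction, the floor of 2 and the cap of 8 are all placeholders that have not been benchmarked, and none of them come from the BioBASH code.

```shell
#!/usr/bin/env bash
# Hypothetical policy sketch, NOT the BioBASH implementation.
# Usage: biobash_default_cores [total_cores]
# With no argument it reads $BIOBASH_CORES, defaulting to 1.
biobash_default_cores() {
    local total="${1:-${BIOBASH_CORES:-1}}"
    local floor=2    # assumed minimum on multi-core machines
    local cap=8      # assumed ceiling so huge systems stay modest
    local default

    # Treat empty or non-numeric input as a single-core system
    case "$total" in
        ''|*[!0-9]*) total=1 ;;
    esac

    if [ "$total" -le 2 ]; then
        # Small systems: stay with the current default of 1 core
        default=1
    else
        # Assumed fraction: one quarter of the cores (integer division)
        default=$(( total / 4 ))
        if [ "$default" -lt "$floor" ]; then
            default=$floor
        fi
        if [ "$default" -gt "$cap" ]; then
            default=$cap
        fi
    fi
    echo "$default"
}
```

Under these placeholder values a 4-core laptop would get 2 cores, a 16-core workstation 4, and a 100-core server 8. The cap is what addresses the concern about splitting small jobs too finely on large machines; the actual thresholds would need to come from the testing mentioned above.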