es-tooling / esperf

A command-line utility for detecting and fixing performance problems
MIT License

Parallelise file scanning/fixing #9

Open 43081j opened 1 month ago

43081j commented 1 month ago

We currently do an inefficient blocking iteration similar to this:

for (const file of files) {
  await doWork(file);
}

This won't scale to larger projects: each file waits for the previous one to finish, so it'll be far too slow.

We should see if we can parallelise (multi-thread) the work. Something like this:

// runs 200 tasks at any one time (max)
// executes tasks in a worker or something
await mapAsyncWithLimitInWorkers(files, (file) => doWork(file), 200);

ideas welcome
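
One way the concurrency-limit half of this idea could look is sketched below. `mapAsyncWithLimit` is a hypothetical name, not an existing esperf function, and the sketch only bounds how many `doWork`-style calls are in flight on a single thread. It does not move work into workers, so CPU-bound scanning would still need something like `worker_threads` on top of it.

```javascript
// Hypothetical helper (not part of esperf): maps `items` through an async
// `fn`, keeping at most `limit` calls in flight at any one time.
// Results are returned in the same order as the input.
async function mapAsyncWithLimit(items, fn, limit) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index], index);
    }
  }

  // Never start more runners than there are items, and always at least one.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

With this shape, the loop above would become roughly `await mapAsyncWithLimit(files, (file) => doWork(file), 200)`. If any call rejects, the returned promise rejects too, although calls already in flight are not cancelled.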

VandeurenGlenn commented 1 month ago

@43081j Maybe a good idea to add a scan speed option?

[image]

43081j commented 1 month ago

i think it might confuse people if we expose the underlying algorithm like that

probably better we do what other libraries do: num_of_cpus / 2 or some such thing
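
A minimal sketch of that "num_of_cpus / 2" default, assuming the CPU count is passed in by the caller. `defaultParallelism` is a hypothetical name, and rounding down with a floor of 1 is an assumption rather than anything decided in this thread.

```javascript
// Hypothetical helper (not part of esperf): picks a default level of
// parallelism as half the logical CPU count, rounded down, never below 1.
function defaultParallelism(cpuCount) {
  // Fall back to a single task if the count is missing or nonsensical.
  if (!Number.isInteger(cpuCount) || cpuCount < 1) {
    return 1;
  }
  return Math.max(1, Math.floor(cpuCount / 2));
}
```

In Node.js the count could come from `os.cpus().length`, or from `os.availableParallelism()` on versions that provide it.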

VandeurenGlenn commented 1 month ago

> i think it might confuse people if we expose the underlying algorithm like that
>
> probably better we do what other libraries do: num_of_cpus / 2 or some such thing

Tbh, in my case the CPU is overclocked to 5.17 GHz (from 3.8), and with the CPU at 100% while scanning for 6 minutes, multiple times, the temperature only got to 65 °C. So it would be nice to have an option/flag to use all cores, since disabling 2 threads on my 3750k means 3 minutes longer scanning (I only have 4 threads 😢).

43081j commented 1 month ago

maybe but it doesn't make much sense as an option presented to the user. they shouldn't need to care how the code works under the hood as long as the scan works.

it would be better to choose a sensible number of threads automatically. it isn't a user concern

VandeurenGlenn commented 1 month ago

Hmm, well, as an option in the CLI it's annoying ofc because of the extra key press. But as a flag it would be a nice extra for hardcore users, and it wouldn't bother normal users.

Thing is, almost any scanner I can think of atm also provides the ability to set scan speeds (virus/malware scanners, duplicate file scanners, etc.), and for me personally it would save a lot of time when using esperf.

VandeurenGlenn commented 1 month ago

@43081j placed the option behind --advanced for now (if false (default), the scanSpeed option is not shown in the CLI). Also, I would advise putting the interactive CLI behind an --interactive/-i flag, like npm-check-updates.

43081j commented 1 month ago

yes of course. see this as a PoC for now

once we're "stable", there should be an interactive mode and a mode run by CLI flags

as for selecting "speed", it doesn't make much sense to choose "slow", etc. you would expose the number of threads, if anything. so we could possibly do that behind a flag (--parallelism=n or --num-threads=4, etc) but it is low priority

importantly, it wouldn't ever result in a prompt (whether interactive mode or not, it'd always be a flag)
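
As a rough illustration of the flag-only approach, the sketch below reads a `--parallelism=n` value from an argument list and otherwise uses a fallback. `parseParallelism` is a hypothetical helper, and rejecting non-positive or non-integer values is an assumption. The flag itself is only floated as a possibility above, not something esperf is known to support.

```javascript
// Hypothetical parser (not part of esperf) for a `--parallelism=n` flag.
// Returns `fallback` when the flag is absent, so no prompt is ever needed.
function parseParallelism(argv, fallback) {
  const prefix = '--parallelism=';
  const arg = argv.find((a) => a.startsWith(prefix));
  if (arg === undefined) {
    return fallback;
  }

  const raw = arg.slice(prefix.length);
  const value = Number(raw);
  // Only accept whole numbers of at least 1.
  if (!Number.isInteger(value) || value < 1) {
    throw new Error(`--parallelism expects a positive integer, got "${raw}"`);
  }
  return value;
}
```

A caller could pass `process.argv.slice(2)` as `argv` and an automatically chosen default as `fallback`.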