Open adityajain07 opened 1 month ago
Another comment:
CPUs != processes != parallel != faster. You gotta know where your probable bottleneck is. You will, in almost all certainty, not get more than about 100 MB/s down through our internet pipe. Our filesystems are much faster than that, so they're not the bottleneck. One CPU core can handle a download program and the disk I/O that 100 MB/s of through-traffic generates.

Downloading images one by one is not automatically bad. It might be bad if, between downloads, there is dead time (from writing out the file, selecting the next URL, or any other reason). It might therefore be worthwhile to have a handful of downloads going in parallel to saturate the network connection. A handful, here, is determined by that dead time, which is a property of your download code's (in)efficiency. It is not determined principally by CPU core count; in fact, it is almost completely independent of it, and it is definitely not 64. It is extremely likely that 4-8 parallel downloads on 1 CPU core will saturate the download bandwidth entirely.

One CPU core can handle almost any number of processes so long as these processes are mostly sleeping, waiting for I/O, and using negligible CPU%, which is likely the case for you. If you've tied the number of downloads to the number of cores, that's a mistake. Remove that tie. It's really got nothing to do with CPU or GPU usage efficiency: a download job is principally about moving data, and just about any single CPU core ought to be adequate for all but the highest-performance downloads on the highest-performance networks and networked filesystems.
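To make "a handful is determined by that dead time" concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple model that is not from the comment itself: every download spends `transfer_s` seconds moving data at full link speed and `dead_s` seconds doing everything else (writing the file, picking the next URL). The function name and both parameters are hypothetical, and real numbers would have to be measured on your own code.

```python
import math


def workers_to_saturate(transfer_s: float, dead_s: float) -> int:
    """Estimate how many parallel downloads keep the link busy.

    Assumed model: each worker alternates between `transfer_s` seconds of
    transfer at full link speed and `dead_s` seconds of dead time. The link
    stays busy once n * transfer_s >= transfer_s + dead_s, which gives
    n >= 1 + dead_s / transfer_s. CPU core count does not appear anywhere.
    """
    if transfer_s <= 0:
        raise ValueError("transfer_s must be positive")
    if dead_s < 0:
        raise ValueError("dead_s must be non-negative")
    return math.ceil(1 + dead_s / transfer_s)
```

Under this model, a download with no dead time needs only one worker, and a download whose dead time is twice its transfer time needs three. The estimate depends only on the ratio of dead time to transfer time, which is why it lands at a small number rather than at the core count.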
Re-emphasizing: Having more than one download process might help you reach the maximum bandwidth of probably 100 MB/s, but adding more processes than the minimum required will only slow down every other process's download. Furthermore, since these processes will spend most of their time sleeping while waiting for data to come in or go out, they won't be using the CPU, which can then be time-sliced between all of them. That's why you only need one CPU core and might only need a small handful of download processes, and you definitely don't need 64 cores.
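One way to decouple the number of downloads from the number of cores is sketched below. It is a minimal illustration, not the download script under discussion: it uses threads rather than separate processes (a swapped-in choice, since the workers mostly sleep on I/O either way), and `MAX_PARALLEL_DOWNLOADS`, `fetch_to_disk`, and `download_all` are hypothetical names. The value 6 is simply an assumed pick from the 4-8 range mentioned above.

```python
import shutil
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List

# Assumption: a small fixed number, deliberately NOT derived from os.cpu_count().
MAX_PARALLEL_DOWNLOADS = 6


def fetch_to_disk(url: str, dest_dir: Path) -> Path:
    """Download one URL into dest_dir and return the file path (hypothetical helper)."""
    name = url.rstrip("/").rsplit("/", 1)[-1] or "index"
    target = dest_dir / name
    with urllib.request.urlopen(url, timeout=30) as resp, open(target, "wb") as out:
        shutil.copyfileobj(resp, out)  # stream to disk without buffering the whole file
    return target


def download_all(
    urls: Iterable[str],
    dest_dir: Path,
    fetch: Callable[[str, Path], Path] = fetch_to_disk,
    max_workers: int = MAX_PARALLEL_DOWNLOADS,
) -> List[Path]:
    """Run at most max_workers downloads at once; results keep the input order."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The threads spend nearly all their time blocked on network and disk I/O,
        # so a single CPU core can time-slice between them.
        return list(pool.map(lambda u: fetch(u, dest_dir), urls))
```

Calling `download_all(urls, Path("images"))` would then need a job allocation of only one CPU core. If the link is still not saturated, the knob to turn is `max_workers`, not the core count.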
Suggestion by the IDT team:
Another suggestion on not asking for multiple CPUs: