open-connectome-classes / StatConn-Spring-2015-Info

introductory material

Computational Efficiency #251

Open mrjiaruiwang opened 9 years ago

mrjiaruiwang commented 9 years ago

Intel has been cranking out better and better CPUs for the last few decades, but I wonder whether we are harnessing the full potential of that hardware growth. Meanwhile, we are still using code written in the 70s to do much of our computation. For example, the convention of doubling an array's capacity when inserting into a full array remains unchanged, even though there is nothing special about the factor of two. Where should we look in software for improvements that could increase our current computational capacity, so that exponentially complex problems we once thought unsolvable could now be tackled?
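
On the array-doubling point: any constant growth factor greater than 1 gives amortized O(1) appends, so the factor of two is a convention (a trade-off between wasted space and how often you copy), not a requirement. A minimal sketch in Python with a configurable growth factor (the class name and default capacity are mine, just for illustration):

```python
# Minimal sketch of a growable array with a configurable growth factor.
# Any factor > 1 yields amortized O(1) appends; 2 is just a common choice.

class GrowableArray:
    def __init__(self, growth_factor=2.0):
        self._capacity = 4
        self._size = 0
        self._data = [None] * self._capacity
        self._growth_factor = growth_factor

    def append(self, value):
        if self._size == self._capacity:
            self._resize()
        self._data[self._size] = value
        self._size += 1

    def _resize(self):
        # Grow by the chosen factor and copy the old contents over.
        new_capacity = max(self._capacity + 1,
                           int(self._capacity * self._growth_factor))
        new_data = [None] * new_capacity
        new_data[:self._size] = self._data[:self._size]
        self._data = new_data
        self._capacity = new_capacity

    def __len__(self):
        return self._size

    def __getitem__(self, i):
        if not 0 <= i < self._size:
            raise IndexError(i)
        return self._data[i]


arr = GrowableArray(growth_factor=1.5)
for i in range(1000):
    arr.append(i)
print(len(arr), arr[999])
```

A smaller factor like 1.5 copies more often but wastes less memory at the high end; the asymptotics are the same either way.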

wrgr commented 9 years ago

A perspective from computer vision in the field: many of our largest datasets today are EM (electron microscopy) data. Getting database reads and writes right is enormously important and often accounts for a very significant portion of our workflow. Moving terabytes (petabytes or exabytes?) around is tough and requires new approaches.

wrgr commented 9 years ago

Also, algorithmically, the two major bottlenecks right now are our membrane detection steps, which rely on deep learning, and a step that forms a region adjacency graph across a large number of nodes; efficient implementations are hard to come by, and scalability is a challenge.
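
For concreteness, here is a minimal sketch of how a region adjacency graph can be built from a labeled 2D segmentation (plain NumPy, not our pipeline's actual implementation; libraries such as scikit-image provide real RAG construction):

```python
import numpy as np

def region_adjacency_graph(labels):
    """Build a region adjacency graph from a 2D label image.

    Returns a dict mapping each region label to the set of labels
    of regions that share at least one pixel edge with it.
    """
    adjacency = {int(l): set() for l in np.unique(labels)}

    # Compare each pixel with its right and lower neighbors;
    # a differing label means the two regions are adjacent.
    for a, b in ((labels[:, :-1], labels[:, 1:]),   # horizontal pairs
                 (labels[:-1, :], labels[1:, :])):  # vertical pairs
        mask = a != b
        for u, v in zip(a[mask], b[mask]):
            adjacency[int(u)].add(int(v))
            adjacency[int(v)].add(int(u))
    return adjacency

# Toy example: four square regions in a 4x4 segmentation.
labels = np.array([[1, 1, 2, 2],
                   [1, 1, 2, 2],
                   [3, 3, 4, 4],
                   [3, 3, 4, 4]])
print(region_adjacency_graph(labels))
# {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
```

The per-boundary-pixel loop is the part that hurts at scale: with billions of supervoxels the same idea has to be chunked and distributed, which is exactly where efficient implementations get hard.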

akim1 commented 9 years ago

One limitation is also the instruction set of the hardware architecture. This is not really software, but if there are certain core operations that you rely on, you can hard-code them into the hardware, which is faster than doing them in software.
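
You can see the spirit of this even without custom silicon: vectorized libraries dispatch inner loops to compiled code that uses the CPU's vector (SIMD) instructions, which an interpreted loop cannot. A rough timing sketch (absolute numbers will vary by machine; this just contrasts the two paths):

```python
import time
import numpy as np

n = 5_000_000
x = np.random.rand(n)
y = np.random.rand(n)

# Pure Python loop: every multiply and add goes through the interpreter.
t0 = time.perf_counter()
s_loop = 0.0
for a, b in zip(x.tolist(), y.tolist()):
    s_loop += a * b
t1 = time.perf_counter()

# np.dot dispatches to compiled BLAS, which uses vectorized hardware instructions.
s_dot = float(np.dot(x, y))
t2 = time.perf_counter()

print(f"python loop: {t1 - t0:.2f}s   np.dot: {t2 - t1:.4f}s")
```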

But with cloud computing, it seems like you can just go to Amazon and rent massive amounts of compute at reasonable cost. In that respect, your software can be "sloppy", and even archaic code is fine as long as the problem is parallelizable.
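
A minimal sketch of the "as long as it's parallelizable" point: an embarrassingly parallel job split across local worker processes (the task and chunk size are made up for illustration; the same pattern scales out to many cloud machines with a job queue in front):

```python
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for a per-chunk task (e.g., segmenting one block of an EM volume).
    return sum(v * v for v in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunk_size = 10_000
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    # Each chunk is independent, so workers never need to coordinate.
    with Pool() as pool:
        partial_results = pool.map(process_chunk, chunks)

    print(sum(partial_results))
```

Because the chunks never talk to each other, adding more workers (or more machines) speeds things up almost linearly, which is why "sloppy but parallel" code can still go far on rented hardware.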