Closed by ardila 9 years ago
Time breakdowns that I've observed are basically:

1) GPU computation itself:
   - large models: < 2 s per batch
   - smaller models (e.g. Krizhevsky): ~0.3 s per batch
2) loading the data:
   - from written batches: > 10 s per batch
   - with optimizations: < 1 s per batch
3) image cropping, when img_flip is True: ~1 s per batch
So I think doing the optimizations is probably a good idea even with the larger models.

On Sep 30, 2014 10:54 AM, Diego Ardila (notifications@github.com) wrote:
@yamins81, have you checked how much effect the optimizations you've done for loading data onto the GPU have when the network is larger (say, the NYU network)? I have a suspicion that these optimizations are less relevant when the GPU compute time is higher, since in theory the CPU should be able to get the next batch ready while the GPU runs, so I was wondering if you've checked this.
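The overlap Diego describes (the CPU preparing the next batch while the GPU computes on the current one) is essentially a prefetch pipeline. A minimal sketch of that idea, not taken from archconvnets itself; the function names and sleep-based timings below are illustrative stand-ins for the real disk I/O and GPU work:

```python
# Double-buffered batch loading: a background thread fills a small queue
# with "loaded" batches while the main thread consumes them, so CPU data
# preparation overlaps with (simulated) GPU compute.
import queue
import threading
import time

def load_batch(i):
    # Stand-in for reading, decoding, and cropping one batch from disk.
    time.sleep(0.01)
    return f"batch-{i}"

def compute_on_gpu(batch):
    # Stand-in for the GPU forward/backward pass on one batch.
    time.sleep(0.01)
    return batch

def run_pipelined(num_batches, prefetch_depth=2):
    # Bounded queue: the loader can run at most prefetch_depth batches ahead.
    q = queue.Queue(maxsize=prefetch_depth)

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))
        q.put(None)  # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    processed = []
    while (batch := q.get()) is not None:
        processed.append(compute_on_gpu(batch))
    return processed

if __name__ == "__main__":
    print(run_pipelined(4))  # -> ['batch-0', 'batch-1', 'batch-2', 'batch-3']
```

When the GPU step dominates (the large-model case above), the loader stays ahead of the consumer and the prefetching adds little; when loading dominates (the > 10 s unoptimized case), the pipeline can only hide compute time, not loading time, which is consistent with Dan's conclusion that the loading optimizations still matter for large models.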
I'm going to look tomorrow.

On Oct 2, 2014 6:24 PM, Darren Seibert (darren@mit.edu) wrote:
have you taken a look?
On Wed, Oct 1, 2014 at 2:14 PM, Darren Seibert darren@mit.edu wrote:
ok. I'll have re-structured the remaining sections by that time, so you'll be able to look at it in total.
On Wed, Oct 1, 2014 at 2:12 PM, Dan Yamins dyamins@gmail.com wrote:
ok, my plan is to look at this tomorrow morning
On Wed, Oct 1, 2014 at 2:11 PM, Darren Seibert darren@mit.edu wrote:
the significance section of the research strategy is also ready for editing.
I'm still re-structuring the innovation and approach sections
On Wed, Oct 1, 2014 at 2:03 PM, Darren Seibert darren@mit.edu wrote:
Can we chat about this tomorrow (Monday) early afternoon?
Thanks, Dan
Sure! I have class 1-3 though.