GeorgiadouAntigoni / weak_lensing_machine_learning


Test deeper architecture #5

Open bnord opened 5 years ago

bnord commented 5 years ago

Test a deep convolutional network with ReLU and the same number of neurons as Arch0Dense from #2.

GeorgiadouAntigoni commented 5 years ago

[Figure: gridtests_cnn_alp5-seed10-epochs500]

I have been testing a deep CNN with ReLU, using the same architecture as Arch0Dense. The plot shows the sigmas versus the number of neurons in the layers, for alp = 5, 10 seeds, and 500 epochs. I am now running with more seeds (100), but Colab keeps stopping the process, so I have to restart. Can we run on the Wilson cluster?

bnord commented 5 years ago

Yep, we can run on Wilson. We'll need to get you an account. Can you email Amitoj G Singh (amitoj@fnal.gov) to ask for one?

bnord commented 5 years ago

Why does colab stop the process? Is it after a certain amount of time?

bnord commented 5 years ago

There’s also a wall-time limit on Wilson, and we'll want to know what it is. We might need to save the network and restart from the saved state, whether on Wilson or on Colab.
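A minimal sketch of the save-and-restart idea, assuming a hypothetical `train_one_epoch` step and a JSON checkpoint file rather than the project's actual training code or file format. The loop stops itself once a wall-time budget is used up and, on the next launch, resumes from the last completed epoch. The `clock` parameter is only there so the budget logic can be exercised without waiting.

```python
import json
import os
import time


def train_one_epoch(state):
    """Hypothetical stand-in for one real training epoch."""
    state["epoch"] += 1
    state["loss"] = 1.0 / state["epoch"]  # placeholder value, not a real loss
    return state


def run_with_budget(ckpt_path, total_epochs, budget_s, clock=time.monotonic):
    """Train until total_epochs is reached or the wall-time budget is used up."""
    start = clock()
    # Resume from the checkpoint if a previous run left one behind.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    else:
        state = {"epoch": 0, "loss": None}

    while state["epoch"] < total_epochs:
        if clock() - start >= budget_s:
            break  # out of time; the next launch continues from the checkpoint
        state = train_one_epoch(state)
        # Write to a temp file first so an interrupted write
        # cannot corrupt the existing checkpoint.
        tmp = ckpt_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, ckpt_path)
    return state
```

Setting `budget_s` somewhat below the scheduler's actual limit leaves room for the final checkpoint write; the right margin depends on how long one epoch takes.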

GeorgiadouAntigoni commented 5 years ago

I noticed that Colab stops before the 12-hour limit. I think it is not intended for long-running tasks.

From the Colab FAQs web page:

"Colaboratory is intended for interactive use. Long-running background computations, particularly on GPUs, may be stopped. Please do not use Colaboratory for cryptocurrency mining. Doing so is unsupported and may result in service unavailability. We encourage users who wish to run continuous or long-running computations through Colaboratory’s UI to use a local runtime."

I save the instances in .h5 files though, so it doesn't start from scratch.
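Since the runs sweep over several neuron counts and seeds, a restart can also skip whole configurations that already finished. The sketch below assumes a hypothetical `train_config` function and a JSON results file; neither is taken from the repository, and the returned metric is a dummy value.

```python
import itertools
import json
import os


def train_config(n_neurons, seed):
    """Hypothetical stand-in for training one network; returns a dummy metric."""
    return float(n_neurons * 1000 + seed)


def run_grid(results_path, neuron_counts, seeds, train=train_config):
    """Run every (neurons, seed) pair once, skipping pairs already on disk."""
    results = {}
    if os.path.exists(results_path):
        with open(results_path) as f:
            results = json.load(f)

    for n, seed in itertools.product(neuron_counts, seeds):
        key = f"{n}:{seed}"
        if key in results:
            continue  # finished in an earlier session
        results[key] = train(n, seed)
        # Persist after every configuration so a stopped session
        # loses at most the run that was in progress.
        tmp = results_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(results, f)
        os.replace(tmp, results_path)
    return results
```

Rerunning the same call after an interruption, or with a longer seed list, only trains the missing pairs.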

bnord commented 5 years ago

Should we try Wilson?

GeorgiadouAntigoni commented 5 years ago

Yes, let's try. I'll ask for an account.