googledatalab / datalab

Interactive tools and developer experiences for Big Data on Google Cloud Platform.
Apache License 2.0

Datalab performance #867

Open VelizarVESSELINOV opened 8 years ago

VelizarVESSELINOV commented 8 years ago

From time to time Datalab is very slow, and I am not sure what the reason for this slowness is. What is the best way to debug Datalab usage and track performance issues?

Concrete question: is it possible to deploy a VM with an SSD disk?

ojarjur commented 8 years ago

Sorry for the late reply, I should have gotten back to you much sooner.

For tracking usage and debugging performance issues, all we have so far is the "Sessions" page. All that shows you is the names of the active notebooks, and we know that is not enough.

We are collecting some thoughts about options for improvement in issue #796, so if there is a particular piece of information you want us to include, then please chime in on that issue.

For the question about deploying a VM with an SSD disk, we unfortunately are not able to support it at this time.

VelizarVESSELINOV commented 8 years ago

Is there any plan to move to a full SaaS offering with scalable instances? For example, adding more instances when more users are connected, and auto-scaling the available resources when some scripts are using 100% of them?

Scale down as well, when no one is using Datalab.

Tracking performance suggestions:

Di-Ku commented 8 years ago

Thanks for the suggestions. We will look into it.

We want to move towards SaaS, but given the current resource/release situation, we will likely first have a local version (laptop/desktop/GCE VM) before going for a fully supported SaaS offering, which takes time and resources to build, harden, and support.

A VM with SSD should be possible with the local option on GCE. However, it will need to use a service account, because the credentials are accessible to anyone in the project. User creds cannot be stored (even in non-persistent form) on a project-wide resource.

srihari4mbatech commented 6 years ago

Hi,

On Google Datalab, git clone is too slow. I am trying to clone the Google Cloud Platform training materials available on GitHub, and it is very slow.

Is there any specific reason? The VM is in us-east1-b and I am accessing it from India (South Asia region).