jclouds / legacy-jclouds-cli

https://jclouds.apache.org
Apache License 2.0
17 stars 5 forks source link

jclouds-cli starts up slowly #41

Closed gaul closed 9 years ago

gaul commented 11 years ago

I measure 3 seconds for even simple commands, seems like entirely startup overhead:

$ time jclouds-cli-1.6.0-SNAPSHOT/bin/jclouds blobstore container-exists jclouds-cli2
File not found: Container does not exist: jclouds-cli2

real    0m3.172s
user    0m3.460s
sys     0m0.152s
gaul commented 11 years ago

This issue is misattributed: YourKit reports that 84% of CPU time is spent in ContextBuilder.newBuilder and ContextBuilder.build. This is a general jclouds issue, not Karaf or CLI specifically.

codefromthecrypt commented 11 years ago

Users aren't concerned as much about CPU time vs elapsed time. Is there a way you can transparently break down this with multiple data points vs jumping to conclusions? Also I think we should consider 2 other things. Compare against über jar and compare against über jar where service loader is not used.

On Friday, February 8, 2013, Andrew Gaul wrote:

This issue is misattributed: YourKit reports that 84% of CPU time is spent in ContextBuilder.newBuilder and ContextBuilder.build. This is a general jclouds issue, not Karaf or CLI specifically.

— Reply to this email directly or view it on GitHubhttps://github.com/jclouds/jclouds-cli/issues/41#issuecomment-13305472..

codefromthecrypt commented 11 years ago

here's some comparison data from blobstore-basics. using the transient blobstore we can separate out any class loading/wiring time vs network time. The results below show that at least on my macbook, everything from booting the JVM to wiring up guice to using BlobStore takes a hair over a second, slightly longer when using string lookup keys.

When maven stops downloading the massive amount of deps needed to try on jclouds-cli, I'll report back on how much time a container list of transient takes there.

jclouds-1.5.7 dep org.jclouds:jclouds-blobstore 5512k all-in

using "transient"

real    0m1.087s
user    0m1.587s
sys 0m0.073s

using new TransientApiMetadata()

real    0m1.097s
user    0m1.621s
sys 0m0.067s

jclouds-1.5.7 dep org.jclouds:jclouds-allblobstore 6572k all-in

using "transient"

real    0m1.136s
user    0m1.625s
sys 0m0.070s

using new TransientApiMetadata()

real    0m1.103s
user    0m1.624s
sys 0m0.072s

jclouds-1.6.0-alpha.2 dep org.jclouds:jclouds-blobstore 4008k all-in

using "transient"

real    0m1.071s
user    0m1.527s
sys 0m0.073s

using new TransientApiMetadata()

real    0m1.073s
user    0m1.535s
sys 0m0.075s

jclouds-1.6.0-alpha.2 dep org.jclouds:jclouds-allblobstore 4904k all-in

using "transient"

real    0m1.219s
user    0m1.623s
sys 0m0.184s

using new TransientApiMetadata()

real    0m1.109s
user    0m1.622s
sys 0m0.076s

Code

import org.jclouds.ContextBuilder;
import org.jclouds.blobstore.BlobStoreContext;
import .org.jclouds.blobstore.TransientApiMetadata;

public class MainApp {
   public static void main(String[] args) throws Exception {
      BlobStoreContext context = ContextBuilder.newBuilder("transient") // or new TransientApiMetadata()
                                               .buildView(BlobStoreContext.class);
      try {
         System.out.println(context.getBlobStore().list());
      } finally {
         context.close();
      }
   }
}
codefromthecrypt commented 11 years ago

Here's the results for 1.6.0-SNAPSHOT for jclouds-cli on my laptop. Since jclouds-cli takes twice the elapsed time, I'm sure we can agree there's a lot of room to improve before optimizing ContextBuilder.

lgml-adrianc2:jclouds-cli adrianc$ mkdir tmp
lgml-adrianc2:jclouds-cli adrianc$ cd tmp
lgml-adrianc2:tmp adrianc$ tar -xzf ../assembly/target/jclouds-cli-1.6.0-SNAPSHOT.tar.gz 
lgml-adrianc2:tmp adrianc$ time ./jclouds-cli-1.6.0-SNAPSHOT/bin/jclouds blobstore list --provider transient --identity foo --credential bar --all

real    0m2.098s
user    0m2.266s
sys 0m0.201s
lgml-adrianc2:tmp adrianc$ du -k ../assembly/target/jclouds-cli-1.6.0-SNAPSHOT.tar.gz
23460   ../assembly/target/jclouds-cli-1.6.0-SNAPSHOT.tar.gz
gaul commented 11 years ago

@adriancole I might misunderstand what you are saying, but profiling

jclouds-cli-1.6.0-SNAPSHOT/bin/jclouds blobstore list --provider transient --identity foo --credential bar --all

still shows most time in ContextBuilder.newBuilder and ContextBuilder.build. Looking through the call tree, it appears that we initialize both the blobstore and compute providers. If we only initialized the specific one we needed could we shave off some run-time?

codefromthecrypt commented 11 years ago

if you actually think loading the compute jars is responsible for doubling even the time to start the JVM, I'll create another example.

gaul commented 11 years ago

I might have measured this incorrectly; I had to hack YourKit to attach to a short-lived process.

codefromthecrypt commented 11 years ago

ok I included jclouds-all dep and also preemptively called Providers.all() and Apis.all(). Clearly there's nothing like a time-doubling impact. Low-hanging fruit is to drop the karaf and massive dist tree stuff and create a single jar that doest the simple commands present in the blobstore cli feature group.

lgml-adrianc2:blobstore-basics adrianc$ time java -jar target/blobstore-basics-jar-with-dependencies.jar
[]

real    0m1.159s
user    0m1.631s
sys 0m0.065s
lgml-adrianc2:blobstore-basics adrianc$ cat ./src/main/java/org/jclouds/examples/blobstore/basics/MainApp.java
package org.jclouds.examples.blobstore.basics;

import org.jclouds.ContextBuilder;
import org.jclouds.blobstore.BlobStoreContext;
import org.jclouds.blobstore.TransientApiMetadata;

public class MainApp {
   public static void main(String[] args) throws Exception {
      for (org.jclouds.providers.ProviderMetadata p : org.jclouds.providers.Providers.all()){
      }
      for (org.jclouds.apis.ApiMetadata a: org.jclouds.apis.Apis.all()) {
      }
      BlobStoreContext context = ContextBuilder.newBuilder("transient")
                                               .buildView(BlobStoreContext.class);
      try {
         System.out.println(context.getBlobStore().list());
      } finally {
         context.close();
      }
   }
}
lgml-adrianc2:blobstore-basics adrianc$ du -k target/blobstore-basics-jar-with-dependencies.jar
9332    target/blobstore-basics-jar-with-dependencies.jar
iocanel commented 11 years ago

@andrewgaul: I think that the cpu is not a good measure. Loading jars from the filesystem and discovering things via the service loader might not be cpu intensive but it still adds a significant delay.

To be more specific:

I measured the overhead of of loading the jars and discovering the commands via java service loader and varies from 160 to 200 ms.

There is an additional overhead of 80 to 100 ms that is the time required from the command processor to parse the command, find the matching class and instantiate it and invoke it.

The rest of time is just the time spent by the actual command. Some of the commands but not all (I think) will retrieve from the shell information like height and width in order to better format the output etc. And also some of the commands will load the properties file with the command output configuration etc.

Now, if we do not wish to discover jars and have a fixed CLASSPATH, then we can significantly decrease the jar loading overhead. Also if we decide not to use the ServiceLoader and go with a fixed set of commands, we can further decrease the overhead. Finally, if we don't care about how the commands will be rendered or have a configurable output we can make things even faster. Its just a matter of what features we want to use.

So for me its just a matter of how much we value extensibility and configurability.