runtimeverification / javamop

Runtime verification system for Java, using AspectJ for instrumentation.
http://fsl.cs.illinois.edu/javamop
MIT License

Usefulness of property *_StaticFactory #236

Open emopers opened 8 years ago

emopers commented 8 years ago

We have seen a large number of violations of the {Boolean, Byte, Character, Double, Float, Long, Integer, Short}_StaticFactory properties during runtime monitoring. However, we do not find these properties very useful for monitoring most application programs. Although the Java API states that using a static factory "is likely to yield significantly better space and time performance by caching frequently requested values.", we think that 1) these performance bugs are of less concern to common developers than functional bugs, and 2) according to our experiment, using the static factory does not seem to yield better space and time performance. The following is the snippet we used for the experiment:

import java.util.Random;
public class TestStaticFactory {
    public static void main(String[] args) {
        long startTime = System.nanoTime();
        int[] arr = new int[100];
        Random r = new Random();
        for(int i = 0; i < 1000000000; i++) {
            int num = r.nextInt(100);
            Integer result = nonStaticFactory(num);
        }
        long elapseTime = System.nanoTime() - startTime;
        System.out.println(Runtime.getRuntime().maxMemory());
        Runtime.getRuntime().gc();
        // The first loop called nonStaticFactory(), so label its time accordingly.
        System.out.println("non-static " + elapseTime);
        startTime = System.nanoTime();
        r = new Random();
        for(int i = 0; i < 1000000000; i++) {
            int num = r.nextInt(100);
            Integer result = staticFactory(num);
        }
        elapseTime = System.nanoTime() - startTime;
        System.out.println(Runtime.getRuntime().maxMemory());
        // The second loop called staticFactory().
        System.out.println("static " + elapseTime);
    }
    private static Integer nonStaticFactory(int num) {
        return new Integer(num);
    }
    private static Integer staticFactory(int num) {
        return Integer.valueOf(num);
    }
}

With ints within 1000 being processed 1 billion times, the caching effect of the static factory should be significant. However, over many runs the difference in running time between the static factory and non-static factory methods was always within 10%, and neither of them consistently beat the other. The maximum memory usage of the two methods was the same. The result is similar if the calling order of the two methods is swapped. In all, though these properties might be useful for high-performance computing, we do not think they provide more benefit to common developers than the noise they introduce.
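For context on which values can benefit from caching at all: the Integer.valueOf documentation says it always caches values in the range -128 to 127 and may cache values outside that range, while `new Integer(...)` always allocates. Below is a minimal sketch of that difference. The class name `BoxingCacheDemo` and its helper methods are made up for illustration, and the snippet assumes a JDK that still ships the deprecated `Integer(int)` constructor.

```java
class BoxingCacheDemo {
    // True when two valueOf calls for the same int return the same object.
    static boolean valueOfSharesInstance(int n) {
        return Integer.valueOf(n) == Integer.valueOf(n);
    }

    // True when two constructor calls for the same int return the same object.
    // This is never the case, because `new` always allocates a fresh object.
    @SuppressWarnings({"deprecation", "removal"})
    static boolean constructorSharesInstance(int n) {
        return new Integer(n) == new Integer(n);
    }

    public static void main(String[] args) {
        // 99 lies inside the guaranteed cache range [-128, 127].
        System.out.println("valueOf(99) shared: " + valueOfSharesInstance(99));
        // 999 lies outside the guaranteed range; the result depends on the JVM
        // and its settings, so no particular output is assumed here.
        System.out.println("valueOf(999) shared: " + valueOfSharesInstance(999));
        System.out.println("new Integer(99) shared: " + constructorSharesInstance(99));
    }
}
```

So with `r.nextInt(100)` every value falls inside the guaranteed cache range, whereas with a bound of 1000 most values may fall outside it, depending on the JVM.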

xiaohe27 commented 8 years ago

@emopers I get the same result when executing the code you provided, but if I change the number of iterations to a smaller number, say one hundred thousand, the static factory performs much better than the non-static version (two to three times as fast) in terms of running time. The space usage was the same.

I do not know what the common practice is for using these libraries. But for projects that do not use millions of integers, the static factory gives a better time result.

xiaohe27 commented 8 years ago
import java.util.Random;

public class TestStaticFactory {
    public static final int BILLION = 1000000000;
    public static final int MILLION = 1000000; //about the same
    public static final int HundredThousand = 100000; //try this.
    public static final int TenThousand = 10000; //time: non-static Vs static about 2:1

    public static final int ITERS = HundredThousand;

    public static void main(String[] args) {
        int base = 1000;

        long startTime = System.nanoTime();

        int[] arr = new int[100]; //what is the purpose of this int array?

        Random r = new Random();
        for (int i = 0; i < ITERS; i++) {
            int num = r.nextInt(base) ;
            Integer result = nonStaticFactory(num);
        }
        long elapseTime_nonStatic = System.nanoTime() - startTime;
        long nonStaticMem = Runtime.getRuntime().maxMemory();
        System.out.println("Max mem for non-static case is " + nonStaticMem);
        long nonStaticTotalMem = Runtime.getRuntime().totalMemory();
        System.out.println("Total mem for non-static case is " + nonStaticTotalMem);

        Runtime.getRuntime().gc();
        System.out.println("time for non-static case is : " + elapseTime_nonStatic);

        startTime = System.nanoTime();
        r = new Random();
        for (int i = 0; i < ITERS; i++) {
            int num = r.nextInt(base) ;
            Integer result = staticFactory(num);
        }
        long elapseTime_static = System.nanoTime() - startTime;

        long staticMem = Runtime.getRuntime().maxMemory();
        long staticTotalMem = Runtime.getRuntime().totalMemory();

        System.out.println("Max mem for static case is " + staticMem);
        System.out.println("Total mem for static case is " + staticTotalMem);

        System.out.println("Time for static case is: " + elapseTime_static);

        System.out.println("Max memory: non-static Vs static is " +
                (nonStaticMem / (double) staticMem) + " : 1");

        System.out.println("Total memory: non-static Vs static is " +
                (nonStaticTotalMem / (double) staticTotalMem) + " : 1");

        System.out.println("Time: non-static Vs static is " +
                (elapseTime_nonStatic / (double) elapseTime_static) + " : 1");
    }

    private static Integer nonStaticFactory(int num) {
        return new Integer(num);
    }

    private static Integer staticFactory(int num) {
        return Integer.valueOf(num);
    }
}

I modified the original code to make the result easier to interpret.

emopers commented 8 years ago

Hi @xiaohe27, thanks for the feedback! We tried out the snippet, and indeed the overhead is very obvious with a smaller number of iterations. However, did you try switching the calling order of the two methods, for example running staticFactory() before nonStaticFactory() with the same parameters? We found that, for some reason, whichever method is called first always takes longer. With TenThousand iterations and staticFactory() called first, staticFactory() has the larger time consumption. And with TenThousand iterations, compared with BILLION, the overhead of the first method called is more obvious. I don't know why this happens either.

xiaohe27 commented 8 years ago

@emopers My intuition is that it may have something to do with caches. Some time ago, when I was trying to optimize the performance of reading a large log file, I noticed that if I read the same log file via my log reader several times consecutively, the first read took significantly longer. If I explicitly cleared the disk caches after the first read, the next read was no longer faster. The above code does not read contents from disk, but it may still have something stored in a cache that speeds up later accesses. So I would suggest running the experiments separately and then comparing the results.
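A rough sketch of what "running the experiments separately" could look like: one variant per JVM process, selected by a command-line argument, with a few untimed rounds before the timed one. The class name `SeparateRunBenchmark`, the argument value `static`, the 20 warm-up rounds and the seed are all arbitrary choices for illustration. That the first-call slowdown comes from JVM start-up effects (class loading, JIT compilation) is only an assumption here, not something established in this thread.

```java
import java.util.Random;

class SeparateRunBenchmark {
    // Boxes `iters` random ints below `bound` with the chosen factory and
    // returns their sum, so the loop produces an observable result.
    @SuppressWarnings({"deprecation", "removal"})
    static long run(boolean useValueOf, int iters, int bound, long seed) {
        Random r = new Random(seed);
        long sum = 0;
        for (int i = 0; i < iters; i++) {
            int num = r.nextInt(bound);
            Integer boxed = useValueOf ? Integer.valueOf(num) : new Integer(num);
            sum += boxed; // auto-unboxing
        }
        return sum;
    }

    public static void main(String[] args) {
        // Pass "static" to time Integer.valueOf; anything else times new Integer.
        boolean useValueOf = args.length > 0 && args[0].equals("static");
        int iters = 100_000; // same order of magnitude as HundredThousand above
        int bound = 1000;    // same as `base` above

        // Untimed rounds so that one-time start-up costs are paid before measuring.
        for (int i = 0; i < 20; i++) {
            run(useValueOf, iters, bound, i);
        }

        long start = System.nanoTime();
        long checksum = run(useValueOf, iters, bound, 42L);
        long elapsed = System.nanoTime() - start;

        System.out.println((useValueOf ? "static" : "non-static")
                + " time (ns): " + elapsed + ", checksum: " + checksum);
    }
}
```

Running `java SeparateRunBenchmark static` and `java SeparateRunBenchmark nonstatic` as two separate processes means neither variant is affected by the other having run first. The numbers are still only indicative, since a hand-written timing loop like this can be distorted by JIT optimizations.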