Closed LNGi closed 12 years ago
I can't reproduce this, but it seems like this is a bug in thrift, not snowflake.
This is a known bug in thrift.
It looks like this issue with thrift has been fixed: https://issues.apache.org/jira/browse/THRIFT-601
How did he reproduce it then?
Related: I found another problem related to this that affects HsHaServer. Will follow up with the thrift devs.
can you build snowflake with thrift 0.2.0? that was before the fix (0.3 according to the bug)
I doubt you can. It requires 0.5.0 as a dependency.
too bad I'm not good at java, so I may not be able to provide more details about this error...
but I think I can describe how I got snowflake running and how I got this error, hope this helps.
download thrift 0.5.0 from apache.org and install it from source:
./configure --with-erlang=no --with-python=no --with-haskell=no
running sbt
[liang@api: snowflake]$ sbt
> debug
> update
> compile
> package
starting up snowflake manually:
java -server -XX:+UseConcMarkSweepGC -verbosegc \
-XX:+PrintGCDetails \
-XX:+PrintGCTimeStamps \
-XX:+PrintGCDateStamps \
-XX:+UseParNewGC \
-Xloggc:/var/log/snowflake/gc.log \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=9998 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-Xmx700m -Xms700m -Xmn500m \
-XX:ErrorFile=/var/log/snowflake/java_error%p.log \
-cp /Users/liang/Projects/snowflake/build/snowflake-1.0.jar:./project/boot/scala-2.8.1/lib/scala-library.jar:./lib_managed/compile/configgy-2.0.1.jar:./lib_managed/compile/zookeeper-client-2.0.0.jar:./libs/zookeeper-3.3.1.jar:./lib_managed/compile/libthrift-0.5.0.jar:./lib_managed/compile/ostrich-4.0.1.jar::./lib_managed/compile/avalon-framework-4.1.3.jar:./lib_managed/compile/commons-codec-1.4.jar:./lib_managed/compile/commons-lang-2.2.jar:./lib_managed/compile/commons-logging-1.1.jar:./lib_managed/compile/commons-pool-1.5.4.jar:./lib_managed/compile/configgy-2.0.1.jar:./lib_managed/compile/json_2.8.0-2.1.4.jar:./lib_managed/compile/json_2.8.1-2.1.6.jar:./lib_managed/compile/libthrift-0.5.0.jar:./lib_managed/compile/log4j-1.2.14.jar:./lib_managed/compile/logkit-1.0.1.jar:./lib_managed/compile/netty-3.2.3.Final.jar:./lib_managed/compile/ostrich-4.0.1.jar:./lib_managed/compile/scala-compiler-2.8.1.jar:./lib_managed/compile/scala-library-2.8.1.jar:./lib_managed/compile/servlet-api-2.3.jar:./lib_managed/compile/slf4j-api-1.5.8.jar:./lib_managed/compile/slf4j-log4j12-1.5.8.jar:./lib_managed/compile/slf4j-nop-1.5.8.jar:./lib_managed/compile/specs_2.8.0-1.6.5.jar:./lib_managed/compile/util-core-1.8.1.jar:./lib_managed/compile/util-eval-1.8.1.jar:./lib_managed/compile/util-logging-1.8.1.jar:./lib_managed/compile/zookeeper-client-2.0.0.jar:./config/ com.twitter.service.snowflake.SnowflakeServer
running test script
[liang@api: snowflake]$ RUBYLIB=./target/gen-rb ./src/scripts/client_test.rb 10 "localhost:7609" test
"localhost"
"7609"
68160518203899904 test 0
68160518203899905 test 0
68160518208094208 test 0
68160518208094209 test 0
68160518212288512 test 0
68160518212288513 test 0
68160518216482816 test 0
68160518216482817 test 0
68160518216482818 test 0
68160518220677120 test 0
so far so good. but if I run "src/scripts/json_stats_fetcher.rb" now, the server crashes.
[liang@api: snowflake]$ RUBYLIB=./target/gen-rb ./src/scripts/json_stats_fetcher.rb
...
...seems the script hangs
the server outputs an OOM error and just hangs there:
ERROR [Thread-1] (TNonblockingServer.java312) - run() exiting due to uncaught error
java.lang.OutOfMemoryError: Java heap space
...
...
soon I found that if I send any malformed data to the server, it also crashes the server.
[liang@api: snowflake]$ echo "hello"|nc localhost 7609
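One plausible explanation for the OOM, assuming the server reads requests through Thrift's framed transport (a 4-byte big-endian length prefix followed by the payload): the first four bytes of "hello" get interpreted as a frame length. The sketch below only illustrates that arithmetic; the function name is hypothetical and this is not Thrift's own code.

```python
import struct


def frame_size_from_prefix(data: bytes) -> int:
    """Interpret the first four bytes as a big-endian signed 32-bit length.

    Hypothetical helper: it mimics how a length-prefixed (framed) protocol
    would read a frame header, under the assumption stated above.
    """
    if len(data) < 4:
        raise ValueError("need at least 4 bytes for a frame header")
    (size,) = struct.unpack(">i", data[:4])
    return size


# What `echo "hello" | nc localhost 7609` puts on the wire.
garbage = b"hello\n"
claimed_size = frame_size_from_prefix(garbage)

# Heap size from the -Xmx700m flag in the startup command.
heap_bytes = 700 * 1024 * 1024

print("claimed frame size:", claimed_size)
print("exceeds 700m heap:", claimed_size > heap_bytes)
```

If the server tries to allocate a buffer of the claimed size before checking it, a request of roughly 1.6 GB against a 700 MB heap would be consistent with the `java.lang.OutOfMemoryError: Java heap space` shown above.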
one more detail: after I compiled thrift 0.5.0, I ran "make install" to install it, but I got this error:
Making install in test
make[4]: Nothing to be done for `install-exec-am'.
make[4]: Nothing to be done for `install-data-am'.
Making install in java
/usr/bin/ant
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/tools/ant/launch/Launcher
Caused by: java.lang.ClassNotFoundException: org.apache.tools.ant.launch.Launcher
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:399)
make[2]: *** [all-local] Error 1
make[1]: *** [install-recursive] Error 1
make: *** [install-recursive] Error 1
so I cd'ed to the java directory and ran "ant install", and this time everything went well:
[liang@api: thrift-0.5.0]$ cd lib/java/
[liang@api: java]$ ant install
Buildfile: /Users/liang/Downloads/thrift-0.5.0/lib/java/build.xml
init:
ivy-init-dirs:
...
...
[javadoc] Generating /Users/liang/Downloads/thrift-0.5.0/lib/java/build/javadoc/stylesheet.css...
[javadoc] 10 warnings
install:
[copy] Copying 124 files to /Users/liang/Downloads/thrift-0.5.0/lib/java/${install.javadoc.path}
BUILD SUCCESSFUL
Total time: 4 seconds
Sorry about closing this issue by accident...
json_stats_fetching isn't working because it's trying to use the wrong port. We no longer have use for that script so I'm going to get rid of it.
As for building thrift, I can't really help you with that. Ask the thrift developers for help.
And about the crash from the garbage data, I still can't reproduce that.
I found that if I telnet to the snowflake server and send some malformed data, the server crashes. That's unacceptable to me. Think about it: if one client misbehaves and sends an invalid request for whatever reason, the server crashes and all the other clients stop working. That would be a disaster.
Sending a malformed request:
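The usual defence against this class of failure is to validate the claimed frame length before allocating anything for it, so one bad client gets its connection dropped instead of taking the server down. The following is a minimal sketch of that idea, not the actual THRIFT-601 change or any Thrift API; `read_frame`, `FrameTooLarge`, and the 16 MB limit are all assumptions chosen for illustration.

```python
import io
import struct

# Hypothetical upper bound on a single request frame.
MAX_FRAME_BYTES = 16 * 1024 * 1024


class FrameTooLarge(Exception):
    """Raised when a frame header claims an implausible length."""


def read_frame(stream, max_frame_bytes=MAX_FRAME_BYTES):
    """Read one length-prefixed frame, rejecting bad lengths up front."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("connection closed before a full frame header arrived")
    (size,) = struct.unpack(">i", header)
    # Check the claimed size BEFORE reading or allocating the payload.
    if size < 0 or size > max_frame_bytes:
        raise FrameTooLarge(f"frame claims {size} bytes, limit is {max_frame_bytes}")
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("connection closed in the middle of a frame")
    return payload


# A well-formed frame is returned as-is.
good = struct.pack(">i", 4) + b"ping"
print(read_frame(io.BytesIO(good)))

# Garbage input is rejected without a large allocation.
try:
    read_frame(io.BytesIO(b"hello\n"))
except FrameTooLarge as err:
    print("rejected:", err)
```

With a check like this, the server would only need to close the offending connection when `FrameTooLarge` is raised; other clients would be unaffected.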
This is the crash message:
My machine: