Scylla JMX proxy
GNU Affero General Public License v3.0
ccm fails to start a node because connection to jmx cannot be established (JMX crashes with SIGSEGV) #194

gleb-cloudius commented 1 year ago

Seen in CI:

self = <cdc_test.TestCdc object at 0x7f9ab9f9d0d0>
request = <FixtureRequest for <Function test_change_field_type_with_cdc[Single_cluster]>>
cluster_config = ClusterConfig(size=[3], replication="{'class': 'SimpleStrategy', 'replication_factor': 3}")

    def test_change_field_type_with_cdc(self, request, cluster_config):
>       self.schema_change_template(request, "ALTER TABLE ALTER b TYPE blob",
                                    cluster_size=cluster_config.size, replication=cluster_config.replication) 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ in schema_change_template
    self.populate_sequentially(n=cluster_size) in populate_sequentially
    node.start(wait_for_binary_proto=True, wait_other_notice=wait_other_notice)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ccmlib.scylla_node.ScyllaNode object at 0x7f9ae4466410>
join_ring = True, no_wait = False, verbose = False, update_pid = True
wait_other_notice = False, replace_token = None, replace_address = None
replace_node_host_id = None
jvm_args = ['--api-address', '', '--collectd-hostname', '05767ff5161b.node2']
wait_for_binary_proto = True, profile_options = None, use_jna = False
quiet_start = False

    def start(self, join_ring=True, no_wait=False, verbose=False,
              update_pid=True, wait_other_notice=None, replace_token=None,
              replace_address=None, replace_node_host_id=None, jvm_args=None, wait_for_binary_proto=None,
              profile_options=None, use_jna=False, quiet_start=False):
        Start the node. Options includes:
          - join_ring: if false, start the node with -Dcassandra.join_ring=False
          - no_wait: by default, this method returns when the node is started
            and listening to clients.
            If no_wait=True, the method returns sooner.
          - wait_other_notice: if True, this method returns only when all other
            live node of the cluster
            have marked this node UP.
          - replace_token: start the node with the -Dcassandra.replace_token
          - replace_node_host_id: start the node with the
            --replace-node-first-boot option to replace a given node
            identified by its host_id.
          - replace_address: start the node with the deprecated
            --replace-address option.

        Extra command line options may be passed using the
        SCYLLA_EXT_OPTS environment variable.

        Extra environment variables for running scylla can be passed using the
        SCYLLA_EXT_ENV environment variable.
        Those are represented in a single string comprised of one or more
        pairs of "var=value" separated by either space or semicolon (';')
        if wait_for_binary_proto is None:
            wait_for_binary_proto = self.cluster.force_wait_for_cluster_start and not no_wait
        if wait_other_notice is None:
            wait_other_notice = self.cluster.force_wait_for_cluster_start and not no_wait
        if jvm_args is None:
            jvm_args = []

        scylla_cassandra_mapping = {'-Dcassandra.replace_address_first_boot':
        # Replace args in the form
        # [''] to ['', 'bar']
        translated_args = []
        new_jvm_args = []
        for jvm_arg in jvm_args:
            if '=' in jvm_arg:
                split_option = jvm_arg.split("=")
                e_msg = ("Option %s not in the form ''. "
                         "Please check your test" % jvm_arg)
                assert len(split_option) == 2, e_msg
                option, value = split_option
                # If we have information on how to translate the jvm option,
                # translate it
                if option in scylla_cassandra_mapping:
                    translated_args += [scylla_cassandra_mapping[option],
                # Otherwise, just pass it as is
        jvm_args = new_jvm_args

        if self.is_running():
            raise NodeError("%s is already running" %

        if not self.is_docker():
            for itf in list(self.network_interfaces.values()):
                if itf is not None and replace_address is None:
                    except Exception as msg:
                        print("{}. Looking for offending processes...".format(msg))
                        for proc in psutil.process_iter():
                            if any(self.cluster.ipprefix in cmd for cmd in proc.cmdline()):
                                print("name={} pid={} cmdline={}".format(,, proc.cmdline()))
                        raise msg

        marks = []
        if wait_other_notice:
            marks = [(node, node.mark_log()) for node in
                     list(self.cluster.nodes.values()) if node.is_live()]

        self.mark = self.mark_log()

        launch_bin = common.join_bin(self.get_path(), BIN_DIR, 'scylla')
        options_file = os.path.join(self.get_path(), 'conf', 'scylla.yaml')

        # TODO: we do not support forcing specific settings
        # TODO: workaround for api-address as we do not load it
        # from config file scylla#59
        conf_file = os.path.join(self.get_conf_dir(), common.SCYLLA_CONF)
        with open(conf_file, 'r') as f:
            data = yaml.safe_load(f)
        jvm_args = jvm_args + ['--api-address', data['api_address']]
        jvm_args = jvm_args + ['--collectd-hostname',
                               '%s.%s' % (socket.gethostname(),]

        # Let's add jvm_args and the translated args

        args = [launch_bin, '--options-file', options_file, '--log-to-stdout', '1'] + jvm_args + translated_args

        # Lets search for default overrides in SCYLLA_EXT_OPTS
        scylla_ext_opts = os.getenv('SCYLLA_EXT_OPTS', "").split()
        opts_i = 0
        orig_args = list(args)
        while opts_i < len(scylla_ext_opts):
            if scylla_ext_opts[opts_i].startswith("--scylla-manager="):
               opts_i += 1
            elif scylla_ext_opts[opts_i].startswith('-'):
                o = scylla_ext_opts[opts_i]
                opts_i += 1
                if '=' in o:
                    opt = o.replace('=', ' ', 1).split()
                    opt = [ o ]
                    while opts_i < len(scylla_ext_opts) and not scylla_ext_opts[opts_i].startswith('-'):
                        opts_i += 1
                if opt[0] not in orig_args:

        if '--developer-mode' not in args:
            args += ['--developer-mode', 'true']
        if '--smp' not in args:
            # If --smp is not passed from cmdline, use default (--smp 1)
            args += ['--smp', str(self._smp)]
        elif self._smp_set_during_test:
            # If node.set_smp() is called during the test, ignore the --smp
            # passed from the cmdline.
            args[args.index('--smp') + 1] = str(self._smp)
            # Update self._smp based on command line parameter.
            # It may be used below, along with self._mem_mb_per_cpu, for calculating --memory
            self._smp = int(args[args.index('--smp') + 1])
        if '--memory' not in args:
            # If --memory is not passed from cmdline, use default (512M per cpu)
            args += ['--memory', '{}M'.format(self._mem_mb_per_cpu * self._smp)]
        elif self._mem_set_during_test:
            # If node.set_mem_mb_per_cpu() is called during the test, ignore the --memory
            # passed from the cmdline.
            args[args.index('--memory') + 1] = '{}M'.format(self._mem_mb_per_cpu * self._smp)
        self._memory = self.parse_size(args[args.index('--memory') + 1])
        if '--default-log-level' not in args:
            args += ['--default-log-level', self.__global_log_level]
        if self.scylla_mode() == 'debug' and '--blocked-reactor-notify-ms' not in args:
            args += ['--blocked-reactor-notify-ms', '5000']
        # TODO add support for classes_log_level
        if '--collectd' not in args:
            args += ['--collectd', '0']
        if '--cpuset' not in args:
            args += ['--overprovisioned']
        if '--prometheus-address' not in args:
            args += ['--prometheus-address', data['api_address']]
        if replace_node_host_id:
            assert replace_address is None, "replace_node_host_id and replace_address cannot be specified together"
            args += ['--replace-node-first-boot', replace_node_host_id]
        elif replace_address:
            args += ['--replace-address', replace_address]
        args += ['--unsafe-bypass-fsync', '1']

        current_node_version = self.node_install_dir_version() or self.cluster.version()
        current_node_is_enterprise = parse_version(current_node_version) > parse_version("2018.1")

        # The '--kernel-page-cache' was introduced by
        # from 4.5 version
        # and 2022.1 Enterprise version
        kernel_page_cache_supported = not current_node_is_enterprise and parse_version(current_node_version) >= parse_version('')
        kernel_page_cache_supported |= current_node_is_enterprise and parse_version(current_node_version) >= parse_version('')
        if kernel_page_cache_supported and '--kernel-page-cache' not in args:
            args += ['--kernel-page-cache', '1']
        commitlog_o_dsync_supported = (
            (not current_node_is_enterprise and parse_version(current_node_version) >= parse_version('3.2'))
            or (current_node_is_enterprise and parse_version(current_node_version) >= parse_version('2020.1'))
        if commitlog_o_dsync_supported:
            args += ['--commitlog-use-o-dsync', '0']

        # The '--max-networking-io-control-blocks' was introduced by
        # from 4.6 version
        # and 2022.1 Enterprise version
        max_networking_io_control_blocks_supported = not current_node_is_enterprise and parse_version(current_node_version) >= parse_version('4.6')
        max_networking_io_control_blocks_supported |= current_node_is_enterprise and parse_version(current_node_version) >= parse_version('')
        if max_networking_io_control_blocks_supported and '--max-networking-io-control-blocks' not in args:
            args += ['--max-networking-io-control-blocks', '1000']

        ext_env = {}
        scylla_ext_env = os.getenv('SCYLLA_EXT_ENV', "").strip()
        if scylla_ext_env:
            scylla_ext_env = re.split(r'[; ]', scylla_ext_env)
            for s in scylla_ext_env:
                    [k, v] = s.split('=', 1)
                except ValueError as e:
                    print("Bad SCYLLA_EXT_ENV variable: {}: {}", s, e)
                    ext_env[k] = v

        message = "Starting scylla: args={} wait_other_notice={} wait_for_binary_proto={}".format(args, wait_other_notice, wait_for_binary_proto)

        scylla_process = self._start_scylla(args, marks, update_pid,

        ip_addr, _ = self.network_interfaces['storage']
        jmx_port = int(self.jmx_port)
        if not self._wait_java_up(ip_addr, jmx_port):
            e_msg = "Error starting node {}: unable to connect to scylla-jmx port {}:{}".format(
           , ip_addr, jmx_port)
>           raise NodeError(e_msg, scylla_process)
E           ccmlib.node.NodeError: Error starting node node2: unable to connect to scylla-jmx port

../scylla/.local/lib/python3.11/site-packages/ccmlib/ NodeError
mykaul commented 1 year ago

On node 2, we can see the following log:

Starting scylla-jmx: args=['/jenkins/workspace/releng/Scylla-CI/scylla/.dtest/dtest-576y7cmy/test/node2/bin/symlinks/scylla-jmx', '-Dapiaddress=', '', '-Djava.rmi.server.hostname=', '', '', '', '', '', '-Xmx256m', '-XX:+UseSerialGC', '', '', '-jar', '/jenkins/workspace/releng/Scylla-CI/scylla/.dtest/dtest-576y7cmy/test/node2/bin/scylla-jmx-1.0.jar']
# A fatal error has been detected by the Java Runtime Environment:
#  SIGSEGV (0xb) at pc=0x00007f5b19fc92a0, pid=43287, tid=0x00007f5b0417d6c0
# JRE version: OpenJDK Runtime Environment (8.0_352-b08) (build 1.8.0_352-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.352-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  []  StubCodeDesc::desc_for(unsigned char*)+0x10
# Core dump written. Default location: /jenkins/workspace/releng/Scylla-CI/scylla-dtest/core or core.43287
# An error report file with more information is saved as:
# /jenkins/workspace/releng/Scylla-CI/scylla-dtest/hs_err_pid43287.log
# Compiler replay data is saved as:
# /jenkins/workspace/releng/Scylla-CI/scylla-dtest/replay_pid43287.log
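
The banner above already carries most of what is needed for a first triage, even without the core dump. A minimal sketch, assuming the banner lines keep the shapes quoted in this thread (other JVM versions may format them differently); `parse_hs_err_banner` is a hypothetical helper, not part of ccm or scylla-jmx:

```python
import re

# Patterns mirror the banner lines quoted in this thread (assumption:
# other JVM builds may print them differently).
_SIGNAL_RE = re.compile(
    r"#\s+(SIG[A-Z]+) \((0x[0-9a-f]+)\) at pc=(0x[0-9a-f]+), pid=(\d+)"
)
_JRE_RE = re.compile(r"# JRE version: .*\(build ([^)]+)\)")
_FRAME_RE = re.compile(r"# [A-Za-z]\s+\[[^\]]*\]\s+(.+)")


def parse_hs_err_banner(text):
    """Hypothetical helper: extract triage fields from a JVM crash banner."""
    info = {}
    for line in text.splitlines():
        m = _SIGNAL_RE.match(line)
        if m:
            info["signal"] = m.group(1)
            info["pc"] = m.group(3)
            info["pid"] = int(m.group(4))
            continue
        m = _JRE_RE.match(line)
        if m:
            info["jre_build"] = m.group(1)
            continue
        m = _FRAME_RE.match(line)
        if m:
            info["frame"] = m.group(1).strip()
    return info
```

Comparing the `jre_build` and `frame` fields across CI runs would show quickly whether later reports are the same crash.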
mykaul commented 1 year ago

@yaronkaikov - is there a way we can still get the logs and/or the coredump?

yaronkaikov commented 1 year ago

We have the logs

mykaul commented 1 year ago

/jenkins/workspace/releng/Scylla-CI/scylla-dtest/core or core.43287


nyh commented 1 year ago

@mykaul are you planning to debug the JRE? :-) Maybe this is fixed in JRE 9, while we are using the antediluvian JRE 8? BTW, even JRE 8 has newer versions; maybe we can update?

mykaul commented 1 year ago

@mykaul are you planning to debug the JRE? :-) Maybe this is fixed in JRE 9, while we are using the antediluvian JRE 8? BTW, even JRE 8 has newer versions; maybe we can update?

I was hoping to find something, yes. There is more than one option:

  1. Newer, albeit not in Ubuntu, OpenJDK 8 (
  2. We can and probably should at some point use a substantially more modern Java... I don't remember even where it's mentioned, but I have it somewhere in the roadmap
  3. Replace this component with a re-write, in a different lang...
nyh commented 1 year ago

I don't know if a newer OpenJDK 8 would have this bug fixed. In fact, I doubt it (see the release notes). OpenJDK 8 was released 9 years ago (I have a child younger than that ;-)), and support for it officially ended a year ago (they are still doing security patches, but ONLY that).

As I noted in another thread, Cassandra already works correctly on OpenJDK 11 (see test/cql-pytest/run-cassandra for some configuration hacks needed to get it to run properly) so I'm pretty sure that JMX should run there as well. OpenJDK 11 will also reach end of life later this year (!), so hopefully Cassandra will get their act together by then (Cassandra can't run on recent OpenJDK), but even if not, I assume it will be much easier to get just JMX to run than the full Cassandra.

denesb commented 1 year ago

3. Replace this component with a re-write, in a different lang...

We only need to rewrite nodetool in a different lang, for which there were several proposals over the years. That is the only user of JMX (well some users might use it directly, inheriting their dependence on it from C*).

gleb-cloudius commented 1 year ago

Maybe we can work around it in ccm? If the jmx start fails, try one more time.
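
A rough sketch of what such a retry could look like, under stated assumptions: `start_fn` is a hypothetical callable that launches scylla-jmx and returns True once the JMX port accepts connections. It is not an existing ccm function, and the attempt count and delay are illustrative only.

```python
import time


def start_with_retry(start_fn, attempts=2, delay_s=1.0, sleep=time.sleep):
    """Call start_fn() until it returns True or the attempts run out.

    start_fn is a hypothetical callable (not a ccm API) that launches
    scylla-jmx and reports whether the JMX port became reachable.
    Returns the number of attempts it took.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        if start_fn():
            return attempt
        if attempt < attempts:
            # Give the crashed JVM a moment to exit before relaunching.
            sleep(delay_s)
    raise RuntimeError(
        "scylla-jmx did not come up after %d attempts" % attempts
    )
```

Note that a retry only hides the SIGSEGV; logging each failed attempt would keep the crash visible in CI.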

bhalevy commented 1 year ago

Cc @tchaikov

bhalevy commented 1 year ago

@fruch is scylla-jmx crashing in related to this issue?

fruch commented 1 year ago

@fruch is scylla-jmx crashing in related to this issue?

Yes seems the same crash

tchaikov commented 1 year ago

Cc @tchaikov

it seems this is different from the one i am trying to fix in #193. the crash came from JRE 8.0.

bhalevy commented 1 year ago

Seen again in

Starting scylla-jmx: args=['/jenkins/workspace/scylla-master/dtest-daily-release/scylla/.dtest/dtest-2pzsnl3j/test/node1/bin/symlinks/scylla-jmx', '-Dapiaddress=', '', '-Djava.rmi.server.hostname=', '', '', '', '', '', '-Xmx256m', '-XX:+UseSerialGC', '', '', '-jar', '/jenkins/workspace/scylla-master/dtest-daily-release/scylla/.dtest/dtest-2pzsnl3j/test/node1/bin/scylla-jmx-1.0.jar']
# A fatal error has been detected by the Java Runtime Environment:
#  SIGSEGV (0xb) at pc=0x00007f0d9bd9d2a0, pid=255567, tid=0x00007f0d98f7d6c0
# JRE version: OpenJDK Runtime Environment (8.0_352-b08) (build 1.8.0_352-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.352-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  []  StubCodeDesc::desc_for(unsigned char*)+0x10
# Core dump written. Default location: /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/core or core.255567
# An error report file with more information is saved as:
# /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/hs_err_pid255567.log
# Compiler replay data is saved as:
# /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/replay_pid255567.log
# If you would like to submit a bug report, please visit:
bhalevy commented 1 year ago

@yaronkaikov can we collect cores and crash logs like mentioned above from the split workers?

yaronkaikov commented 1 year ago

@bhalevy What do you want us to upload?

bhalevy commented 1 year ago

The core dump to start with, from

# Core dump written. Default location: /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/core or core.255567

And those logs, if possible:

# An error report file with more information is saved as:
# /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/hs_err_pid255567.log
# Compiler replay data is saved as:
# /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/replay_pid255567.log
fruch commented 1 year ago

The core dump to start with, from

# Core dump written. Default location: /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/core or core.255567

And those logs, if possible:

# An error report file with more information is saved as:
# /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/hs_err_pid255567.log
# Compiler replay data is saved as:
# /jenkins/workspace/scylla-master/dtest-daily-release/scylla-dtest/replay_pid255567.log

@bhalevy, I think this might take care of it:

bhalevy commented 1 year ago


bhalevy commented 1 year ago

We now have the coredump and vm log in

fruch commented 1 year ago

We now have the coredump and vm log in

looks like just the logs, no coredump file. I'm guessing it might need some time before it's available. Anyhow I hope that would be enough to figure it out...

tchaikov commented 1 year ago

updated #193 in the hope of alleviating this pain.

bhalevy commented 1 year ago

Still happens with scylladb/scylla-jmx@48e16998d92965efe9e7e311e5ad15de6bfdb497

instanceKlass java/nio/LongBuffer
instanceKlass java/nio/CharBuffer
instanceKlass java/nio/ByteBuffer
ciInstanceKlass java/nio/Buffer 1 1 122 100 10 9 9 100 100 10 8 10 10 10 10 9 10 10 8 8 8 9 10 100 10 100 10 100 10 100 10 7 7 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 12 12 12 1 1 1 12 12 12 12 12 12 12 1 1 1 12 1 1 1 1 1 1 1 1 1 1 1 1
ciInstanceKlass java/lang/Boolean 1 1 124 10 9 10 10 8 10 9 9 8 10 7 10 10 100 100 10 10 8 10 9 7 100 100 1 1 1 1 1 1 1 1 1 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 12 12 12 12 1 7 12 12 12 1 12 1 12 7 12 1 1 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Boolean TRUE Ljava/lang/Boolean; java/lang/Boolean
staticfield java/lang/Boolean FALSE Ljava/lang/Boolean; java/lang/Boolean
staticfield java/lang/Boolean TYPE Ljava/lang/Class; java/lang/Class
ciInstanceKlass java/lang/Character 1 1 498 7 100 10 9 9 10 10 10 10 3 3 3 3 3 10 10 3 11 11 10 10 100 10 10 3 10 10 10 100 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 5 0 10 10 10 10 10 10 10 10 10 10 9 100 10 10 10 3 10 10 100 10 10 10 10 8 10 9 10 10 10 10 8 10 9 100 100 100 100 1 1 100 1 100 1 100 1 1 1 1 3 1 3 1 1 3 1 3 1 1 1 1 1 1 1 3 1 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 3 1 1 3 1 1 1 1 1 3 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 100 1 1 1 1 1 1 1 1 12 12 12 12 12 12 100 12 12 12 100 12 12 12 12 1 12 12 12 12 1 12 12 12 12 12 12 7 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 1 12 12 100 12 12 1 12 12 12 1 100 12 100 12 12 12 7 12 1 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Character TYPE Ljava/lang/Class; java/lang/Class
staticfield java/lang/Character $assertionsDisabled Z 1
instanceKlass java/util/concurrent/atomic/AtomicLong
instanceKlass java/util/concurrent/atomic/AtomicInteger
instanceKlass java/lang/Long
instanceKlass java/lang/Integer
instanceKlass java/lang/Short
instanceKlass java/lang/Byte
instanceKlass java/lang/Double
instanceKlass java/lang/Float
ciInstanceKlass java/lang/Number 1 1 37 10 10 100 7 100 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 12 12 1 1 1
ciInstanceKlass java/lang/Float 1 1 193 7 100 10 10 100 4 100 10 10 8 8 10 10 10 10 4 4 4 10 9 10 10 10 10 10 10 3 3 3 10 10 10 10 8 10 9 100 100 1 1 1 1 1 4 1 1 1 4 1 1 3 1 3 1 3 1 3 1 1 1 1 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 12 100 12 1 1 12 100 12 1 1 100 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Float TYPE Ljava/lang/Class; java/lang/Class
ciInstanceKlass java/lang/Double 1 1 253 7 100 10 10 10 100 10 10 6 0 8 10 8 10 8 100 6 0 10 5 0 5 0 8 8 10 10 8 10 8 8 8 10 10 10 10 10 10 10 10 6 0 6 0 6 0 10 9 10 10 10 10 5 0 5 0 10 10 10 10 8 10 9 100 100 1 1 1 1 1 6 0 1 1 1 6 0 1 1 3 1 3 1 3 1 3 1 1 1 1 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 100 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 12 12 12 1 12 100 12 1 12 1 12 1 1 12 1 1 100 12 100 12 1 12 1 1 1 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Double TYPE Ljava/lang/Class; java/lang/Class
ciInstanceKlass java/lang/Byte 1 1 168 7 10 9 10 100 100 10 8 10 8 10 10 10 10 10 10 10 10 8 8 10 9 10 10 10 10 5 0 10 8 10 9 100 100 100 1 1 1 1 1 3 1 3 1 1 1 1 1 1 1 3 1 3 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 12 12 12 1 1 12 1 12 1 12 12 12 12 12 12 12 12 1 1 12 12 12 12 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Byte TYPE Ljava/lang/Class; java/lang/Class
ciInstanceKlass java/lang/Short 1 1 176 7 100 10 10 100 100 10 8 10 8 10 10 10 10 10 10 9 10 10 10 8 8 10 9 10 10 10 10 3 3 5 0 10 8 10 9 100 100 100 1 1 1 1 1 3 1 3 1 1 1 1 1 1 1 3 1 3 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 12 12 1 1 12 1 12 1 12 12 12 12 12 12 12 12 12 12 1 1 12 12 12 12 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Short TYPE Ljava/lang/Class; java/lang/Class
ciInstanceKlass java/lang/Integer 1 1 359 7 100 7 10 9 7 10 10 10 10 10 10 10 10 3 8 10 10 10 3 9 9 3 9 100 8 10 100 10 8 10 10 8 10 8 10 3 10 10 10 10 8 100 10 10 5 0 8 10 10 7 9 9 10 10 9 10 10 10 10 100 100 10 8 8 10 8 8 8 8 8 8 10 10 10 5 0 3 3 3 3 3 10 10 8 10 9 3 3 3 3 3 3 7 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 3 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 100 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 12 12 1 12 12 100 12 12 12 100 12 12 12 1 12 12 12 12 12 12 1 1 12 1 12 1 12 12 1 12 1 12 12 12 12 12 1 1 12 12 1 12 12 1 12 12 12 12 12 12 12 7 12 1 1 12 1 1 12 1 1 1 1 1 1 12 12 12 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Integer TYPE Ljava/lang/Class; java/lang/Class
staticfield java/lang/Integer digits [C 36
staticfield java/lang/Integer DigitTens [C 100
staticfield java/lang/Integer DigitOnes [C 100
staticfield java/lang/Integer sizeTable [I 10
ciInstanceKlass java/lang/Long 1 1 415 7 100 100 10 9 100 10 10 10 10 10 5 0 5 0 100 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 5 0 8 10 10 10 100 5 0 5 0 9 9 3 3 100 8 10 8 10 8 8 10 5 0 10 10 10 10 8 100 10 10 8 10 8 10 10 5 0 5 0 9 10 8 8 10 8 8 8 8 8 8 10 10 10 10 9 10 10 10 100 100 10 10 10 10 10 5 0 5 0 5 0 5 0 5 0 10 10 10 8 10 9 100 100 100 1 1 1 1 1 1 5 0 1 1 1 1 1 1 1 3 1 3 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 12 12 1 12 12 12 12 12 1 12 12 12 12 12 12 100 12 12 12 12 12 12 100 12 12 12 1 12 12 12 1 12 12 1 1 12 1 12 1 1 12 12 12 12 12 1 1 12 12 1 12 1 12 12 12 12 1 1 12 1 1 1 1 1 1 12 12 12 12 12 12 100 12 1 1 12 12 12 12 12 12 12 1 7 12 12 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
staticfield java/lang/Long TYPE Ljava/lang/Class; java/lang/Class
ciInstanceKlass java/lang/NullPointerException 1 1 26 10 10 100 100 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 12 12 1 1
ciInstanceKlass java/lang/ArithmeticException 1 1 26 10 10 100 100 1 1 1 5 0 1 1 1 1 1 1 1 1 1 1 1 1 12 12 1 1
ciMethod java/lang/String hashCode ()I 2705 32769 393 0 -1
ciMethodData java/lang/String hashCode ()I 1 6429 orig 264 136 116 107 21 29 127 0 0 56 94 128 0 29 127 0 0 152 1 0 0 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 77 68 79 32 101 120 116 114 97 32 100 97 116 97 32 108 111 99 107 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 0 0 185 1 0 0 233 72 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 120 0 0 0 255 255 255 255 7 0 6 0 0 0 0 0 data 15 0x60007 0x11 0x78 0x26 0xe0007 0x0 0x58 0x26 0x1e0007 0x26 0x38 0x902 0x2d0003 0x902 0xffffffffffffffe0 oops 0
compile java/lang/String hashCode ()I -1 3
mykaul commented 1 year ago

JRE version: OpenJDK Runtime Environment (8.0_352-b08) (build 1.8.0_352-b08)

Java VM: OpenJDK 64-Bit Server VM (25.352-b08 mixed mode linux-amd64 compressed oops)

For some reason it still uses the older JDK/JRE.

mykaul commented 1 year ago

scylladb/scylla-cluster-tests#5994 perhaps is needed?

fruch commented 1 year ago

scylladb/scylla-cluster-tests#5994 perhaps is needed?

dtest doesn't use anything from SCT.

dtest selects Java 1.8 as the default in its Docker image. I think we'll need to rebuild it to switch to the latest JVM.
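
To confirm which Java an image actually defaults to, one could parse the output of `java -version`. A minimal sketch, assuming the usual `version "..."` line; `java_major_version` is a hypothetical helper name:

```python
import re

_VERSION_RE = re.compile(r'version "([^"]+)"')


def java_major_version(version_output):
    """Return the Java major version from `java -version` output, or None."""
    m = _VERSION_RE.search(version_output)
    if not m:
        return None
    parts = m.group(1).split(".")
    # Java 8 and older report "1.<major>..."; Java 9+ report "<major>...".
    try:
        if parts[0] == "1" and len(parts) > 1:
            return int(parts[1])
        return int(re.match(r"\d+", parts[0]).group(0))
    except (ValueError, AttributeError):
        return None
```

`java -version` prints to stderr, so that is the stream to capture when feeding this function.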

mykaul commented 1 year ago

Not a ccm issue?

fruch commented 1 year ago

Not a ccm issue?

Never was a ccm issue. It seems to be an issue with scylla-jmx using an old JVM by default. We were trying to move to a newer one, but changes in scylla-jmx are not enough, since the Java in use is controlled by the environment, and it's not different between the different test frameworks (dtest, SCT, core unittests).

mykaul commented 1 year ago

Not a ccm issue?

Never was a ccm issue. It seems to be an issue with scylla-jmx using an old JVM by default. We were trying to move to a newer one, but changes in scylla-jmx are not enough, since the Java in use is controlled by the environment, and it's not different between the different test frameworks (dtest, SCT, core unittests).

Thanks. I saw scylladb/scylla-ccm@8af126d and thought it was related. BTW, in a sane OS, we should use 'alternatives' to determine which Java binary we wish to use. Regardless, we need to update the toolchains... Do we have open issues for them?

tchaikov commented 1 year ago

Not a ccm issue?

Never was a ccm issue. It seems to be an issue with scylla-jmx using an old JVM by default. We were trying to move to a newer one, but changes in scylla-jmx are not enough, since the Java in use is controlled by the environment, and it's not different between the different test frameworks (dtest, SCT, core unittests).

Thanks. I saw scylladb/scylla-ccm@8af126d and thought it was related. BTW, in a sane OS, we should use 'alternatives' to determine which Java binary we wish to use.

i doubt this. alternatives is a user-facing facility, so a user can, for instance, get neovim instead of vim when he/she runs vim, or jre-8 instead of jre-11 when running java. but when it comes to the behavior of a package which explicitly depends only on jre-11-headless, the JRE it uses should be the one it chooses, not the one favored by the user -- what if the user prefers staying on the edge and installs jre-17? we should not break and blame the user.
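
The idea of a package choosing its own JRE rather than following the user's default can be sketched roughly as below. This is only an illustration: `pick_java` and the `candidates` list are hypothetical, and the real scylla-jmx launcher may resolve its runtime differently.

```python
import os
import shutil


def pick_java(candidates, fallback_name="java"):
    """Return the first executable candidate path, else whatever PATH resolves.

    `candidates` is a hypothetical, ordered list of JRE binaries the
    package is willing to run with (for example a jre-11 location).
    """
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Fall back to the user's default (PATH / alternatives) only when none
    # of the explicitly supported runtimes is installed.
    return shutil.which(fallback_name)
```

With this ordering, a user switching the system default to another JRE would not change what the package runs, as long as one of the listed runtimes is present.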

Regardless, we need to update the toolchains... Do we have open issues for them?

if Avi's assertion still holds, see , we don't need to update the dbuild docker image.

i will try to create a patch to teach scylla-jmx to use jre-11 even if it's installed from a relocatable package. my previous fix of 82810949183891682c5ec7f8dbc2f020fccc2d33 only works in the non-packaging mode.

fruch commented 1 year ago

Not a ccm issue?

Never was a ccm issue. It seems to be an issue with scylla-jmx using an old JVM by default. We were trying to move to a newer one, but changes in scylla-jmx are not enough, since the Java in use is controlled by the environment, and it's not different between the different test frameworks (dtest, SCT, core unittests).

Thanks. I saw scylladb/scylla-ccm@8af126d and thought it was related. BTW, in a sane OS, we should use 'alternatives' to determine which Java binary we wish to use. Regardless, we need to update the toolchains... Do we have open issues for them?

I've opened a PR to remove the change of the Java alternative that the dtest Docker image was doing.

bhalevy commented 5 months ago

We're seeing something similar in 5.2.15 now. For example,

Starting scylla-jmx: args=['/jenkins/workspace/scylla-5.2/dtest-release/scylla/.dtest/dtest-wcqq6s0e/test/node2/bin/symlinks/scylla-jmx', '-Dapiaddress=', '', '-Djava.rmi.server.hostname=', '', '', '', '', '', '-Xmx256m', '-XX:+UseSerialGC', '', '', '-jar', '/jenkins/workspace/scylla-5.2/dtest-release/scylla/.dtest/dtest-wcqq6s0e/test/node2/bin/scylla-jmx-1.0.jar']
# A fatal error has been detected by the Java Runtime Environment:
#  SIGSEGV (0xb) at pc=0x00007fb31cae52a0, pid=3238, tid=0x00007fb319cc56c0
# JRE version: OpenJDK Runtime Environment (8.0_352-b08) (build 1.8.0_352-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.352-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  []  StubCodeDesc::desc_for(unsigned char*)+0x10
# Core dump written. Default location: /jenkins/workspace/scylla-5.2/dtest-release/scylla-dtest/core or core.3238
# An error report file with more information is saved as:
# /jenkins/workspace/scylla-5.2/dtest-release/scylla-dtest/hs_err_pid3238.log
# Compiler replay data is saved as:
# /jenkins/workspace/scylla-5.2/dtest-release/scylla-dtest/replay_pid3238.log
# If you would like to submit a bug report, please visit:
bhalevy commented 5 months ago

@denesb I'm not exactly sure what triggered the above, but we should probably backport the following patches to 5.2:

fruch commented 5 months ago

@denesb I'm not exactly sure what triggered the above, but we should probably backport the following patches to 5.2:

it would drag in a lot of other changes across the board (both in dtest and SCT)

mykaul commented 5 months ago

We've updated the JDK version in latest 5.2/2023.1, hoping it'll solve that crash (which doesn't seem to be JMX fault...)

fruch commented 5 months ago

We've updated the JDK version in latest 5.2/2023.1, hoping it'll solve that crash (which doesn't seem to be JMX fault...)

When you say it's updated, what exactly changed, and where? Because one wouldn't need to update the dtest Docker image for that (and I don't think that was done on those branches).

mykaul commented 5 months ago

We've updated the JDK version in latest 5.2/2023.1, hoping it'll solve that crash (which doesn't seem to be JMX fault...)

When you say it's updated, what exactly changed, and where? Because one wouldn't need to update the dtest Docker image for that (and I don't think that was done on those branches).

build 1.8.0_352-b08 updated to build 1.8.0_392, or something like that. If you are running JMX in your own Docker image, then I expect you to update it as well (do you?)
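
To check whether a given environment actually picked up such an update, the build string reported by the JVM can be compared against the expected update number. A rough sketch, assuming JDK 8 style build strings like the ones in this thread; the function names are hypothetical, and 392 is only the approximate number mentioned above:

```python
import re


def jdk8_update(build):
    """Return the update number from a JDK 8 build string.

    Accepts forms such as '1.8.0_352-b08' or '1.8.0_392'.
    """
    m = re.fullmatch(r"1\.8\.0_(\d+)(?:-b\d+)?", build.strip())
    if m is None:
        raise ValueError(f"not a JDK 8 build string: {build!r}")
    return int(m.group(1))


def is_at_least(build, minimum_update):
    """True if the build's update number is >= minimum_update."""
    return jdk8_update(build) >= minimum_update
```

For example, `is_at_least("1.8.0_352-b08", 392)` is false, which is what one would see if the image running JMX still carries the old JDK.
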

fruch commented 5 months ago

We've updated the JDK version in latest 5.2/2023.1, hoping it'll solve that crash (which doesn't seem to be JMX fault...)

When you say it's updated, what exactly changed, and where? Because one wouldn't need to update the dtest Docker image for that (and I don't think that was done on those branches).

build 1.8.0_352-b08 updated to build 1.8.0_392, or something like that. If you are running JMX in your own Docker image, then I expect you to update it as well (do you?)

We are not updating anything on the branch automatically on its own, so if something like that is needed, one should raise an issue for dtest to do it; otherwise no one knows about it.

mykaul commented 5 months ago

@fruch - how can you follow scylladb/scylladb@fcfcd6d ? I'm not sure you have a diff even... (the request came from scylladb/scylla-enterprise#3872 - and a similar issue exists for 5.2/2023.1.x)

bhalevy commented 5 months ago

In 5.2, it appears like the toolchain change is part of scylladb/scylladb@5a05ccc2f897871df6d877e1d3a52a2c0981cc7c

mykaul commented 5 months ago

In 5.2, it appears like the toolchain change is part of scylladb/scylladb@5a05ccc

That just adds pyudev. But we have an overall update of the toolchain regardless. (or we should have, if we haven't done it yet).

fruch commented 5 months ago

@fruch - how can you follow scylladb/scylladb@fcfcd6d ? I'm not sure you have a diff even... (the request came from scylladb/scylla-enterprise#3872 - and a similar issue exists for 5.2/2023.1.x)

How is that change related to a JVM update? Regardless, there is no information there about what the change is or why it was done, so I'm not sure I understand the question or what exactly you think I should follow (and why).

mykaul commented 5 months ago

@fruch - how can you follow scylladb/scylladb@fcfcd6d ? I'm not sure you have a diff even... (the request came from scylladb/scylla-enterprise#3872 - and a similar issue exists for 5.2/2023.1.x)

How is that change related to a JVM update? Regardless, there is no information there about what the change is or why it was done, so I'm not sure I understand the question or what exactly you think I should follow (and why).

(and the corresponding 5.2 one) update the JVM to the latest one, that's all.

bhalevy commented 2 months ago

Apparently seen again in 5.2.18-0.20240419.dae9bef75f66:

Starting scylla-jmx: args=['/jenkins/workspace/scylla-5.2/dtest-release/scylla/.dtest/dtest-jur8l3zr/test/node1/bin/symlinks/scylla-jmx', '-Dapiaddress=', '', '-Djava.rmi.server.hostname=', '', '', '', '', '', '-Xmx256m', '-XX:+UseSerialGC', '', '', '-jar', '/jenkins/workspace/scylla-5.2/dtest-release/scylla/.dtest/dtest-jur8l3zr/test/node1/bin/scylla-jmx-1.0.jar']
# A fatal error has been detected by the Java Runtime Environment:
#  SIGSEGV (0xb) at pc=0x00007fba72e4c2a0, pid=96620, tid=0x00007fba5cfff6c0
# JRE version: OpenJDK Runtime Environment (8.0_352-b08) (build 1.8.0_352-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.352-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  []  StubCodeDesc::desc_for(unsigned char*)+0x10
# Core dump written. Default location: /jenkins/workspace/scylla-5.2/dtest-release/scylla-dtest/core or core.96620
# An error report file with more information is saved as:
# /jenkins/workspace/scylla-5.2/dtest-release/scylla-dtest/hs_err_pid96620.log
# Compiler replay data is saved as:
# /jenkins/workspace/scylla-5.2/dtest-release/scylla-dtest/replay_pid96620.log
# If you would like to submit a bug report, please visit:
mykaul commented 2 months ago

As I mentioned elsewhere, even FC37 has a newer build ( ) - I'm not sure why we are not updating to it.