ninia / jep

Embed Python in Java

JVM crashes with 'insufficient memory' after upgrading from Jep 3.8.2 to 4.0.3 (and moving from Python 2.7 to 3.10) #430

Closed rorytorneymf closed 1 year ago

rorytorneymf commented 1 year ago

I was previously using Jep 3.8.2 with the following dependencies without issues, on openSUSE Leap 15.4 with Python 2.7:

RUN zypper -n refresh && \
zypper -n update && \
zypper -n install python-devel && \
zypper -n install python-pip && \
zypper -n install python-matplotlib && \
zypper -n install zlib-devel && \
zypper -n install python-numpy-devel && \
zypper -n install python-lxml && \
zypper -n install python-scipy && \
zypper -n install gcc-c++ && \
pip install regex==2014.12.24 && \
pip install talon==1.3.4 && \
pip install jep==3.8.2 && \
zypper -n clean --all

ENV LD_PRELOAD=/usr/lib64/python2.7/config/libpython2.7.so
ENV LD_LIBRARY_PATH=/usr/local/lib/python2.7/site-packages/talon:/usr/local/lib64/python2.7/site-packages/jep

Example of the script I am calling from Java:

split_email.py

import talon
from talon import quotations
from talon import signature
from talon.signature.bruteforce import extract_signature

import os
import resource

# Both environment variables are expressed in MiB (1 MiB = 1048576 bytes)
soft = os.getenv('PYTHON_RLIMIT_DATA_SOFT', default='256')
hard = os.getenv('PYTHON_RLIMIT_DATA_HARD', default='512')

resource.setrlimit(resource.RLIMIT_DATA, (int(soft) * 1048576, int(hard) * 1048576))

# don't forget to init the library first
# it loads machine learning classifiers
talon.init()

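def splitEmail(message):
    # Called from Java as split_email.splitEmail(arg), for example with
    # "Reply\n\n-----Original Message-----\n\nQuote"
    return quotations.split_emails(message)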
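For reference, here is a minimal sketch of how the two environment variables in the script above translate into the byte values passed to resource.setrlimit. The helper name limits_from_env and the soft/hard sanity check are hypothetical and not part of the original script. Resource limits apply to the whole process, so when Python is embedded through Jep they apply to the JVM process as well.

```python
import os

MIB = 1024 * 1024  # 1048576 bytes, the multiplier used in the script above


def limits_from_env(env=os.environ):
    """Return (soft, hard) RLIMIT_DATA values in bytes.

    Hypothetical helper: reads the same variables as the script above,
    with the same defaults (256 and 512, both in MiB).
    """
    soft = int(env.get('PYTHON_RLIMIT_DATA_SOFT', '256'))
    hard = int(env.get('PYTHON_RLIMIT_DATA_HARD', '512'))
    if soft > hard:
        # Added check for illustration; the original script does not do this.
        raise ValueError('soft limit must not exceed hard limit')
    return soft * MIB, hard * MIB
```

The result could then be passed on as resource.setrlimit(resource.RLIMIT_DATA, limits_from_env()).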

And this is how I was calling the script from Java:

/**
 * A thread local instance of the Jep library. This is required to be thread local
 * as <a href="https://github.com/mrj0/jep/wiki/Performance-Considerations">
 * Jep will only execute calls on the thread it was instantiated on</a>
 * and <a href="https://github.com/mrj0/jep/issues/28">closing the Jep instance breaks the NumPy Python library</a>.
 * Because of these two issues all Worker threads will call to a separate Jep thread.
 */
private static final ThreadLocal<Jep> threadLocal = new ThreadLocal<Jep>() {
    @Override
    protected Jep initialValue() {
        try {
            return new Jep(false, null, null, new ClassEnquirerImpl());
        } catch (JepException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void remove() {
        Jep jep = this.get();
        if (jep != null) {
            try {
                jep.close();
            } catch (JepException ex) {
                throw new RuntimeException(ex);
            }
        }
        super.remove();
    }
};

public static List<Integer> splitEmail(String message) throws JepException {
        List<Integer> emailStartLineNumbers = new ArrayList<>();
        Jep jep = threadLocal.get();
        if (jep == null) {
            return emailStartLineNumbers;
        }
        jep.eval("import split_email");
        jep.set("arg", message);
        jep.eval("x = split_email.splitEmail(arg)");
        Object lineMarkers = jep.getValue("x");
        jep.eval("del x");
        jep.eval("del arg");
        if (lineMarkers instanceof String) {
            char[] markers = ((String) lineMarkers).toCharArray();
            int size = markers.length;
            for (int i = 0; i < size; i++) {
                if (markers[i] == 's') {
                    emailStartLineNumbers.add(i);
                }
            }
        } else {
            throw new RuntimeException("Unexpected return type from Python when separating email messages.");
        }

        return emailStartLineNumbers;

    }
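The marker scan at the end of splitEmail can be illustrated in isolation. Below is a minimal Python sketch of the same logic. It assumes, as the Java code does, that the value coming back from Python is a string with one marker character per line and that 's' marks the first line of an email. The function name email_start_lines is hypothetical.

```python
def email_start_lines(markers):
    """Return the indices of all lines marked 's'.

    Mirrors the Java loop above: a non-string value is rejected,
    and every position holding 's' is collected in order.
    """
    if not isinstance(markers, str):
        raise TypeError('expected a marker string from split_emails')
    return [i for i, marker in enumerate(markers) if marker == 's']
```

For example, email_start_lines('tsmmst') yields [1, 4], the two positions holding 's'.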

This all worked fine. However, I am now in the process of updating to Python 3.10 (openSUSE Tumbleweed) and Jep 4.0.3. My updated dependencies look like this:

# Pinning version of scikit-learn to 1.0.1 to avoid this error: "Trying to unpickle estimator LinearSVC from version 1.0.1 when using version 1.1.2."
RUN zypper -n refresh && \
zypper -n update && \
zypper -n install python3-devel && \
zypper -n install python3-pip && \
zypper -n install python3-matplotlib && \
zypper -n install zlib-devel && \
zypper -n install python3-numpy-devel && \
zypper -n install python3-lxml && \
zypper -n install python3-scipy && \
zypper -n install gcc-c++ && \
pip install scikit-learn==1.0.1 && \
pip install regex==2022.6.2 && \
pip install -U https://github.com/mailgun/talon/archive/refs/tags/v1.6.0.zip && \
pip install jep==4.0.3 && \
zypper -n clean --all

ENV LD_PRELOAD=/usr/lib64/python3.10/config-3.10-x86_64-linux-gnu/libpython3.10.so
ENV LD_LIBRARY_PATH=/usr/lib/python3.10/site-packages/talon:/usr/lib64/python3.10/site-packages/jep

Because Jep is now an abstract class in 4.0.3, I also updated my Java code to use SubInterpreter rather than Jep:

private static final ThreadLocal<Jep> threadLocal = new ThreadLocal<Jep>() {
    @Override
    protected Jep initialValue() {
        try {
            final JepConfig jepConfig = new JepConfig();
            jepConfig.setIncludePath(null);
            jepConfig.setClassLoader(null);
            jepConfig.setClassEnquirer(new ClassEnquirerImpl());

            return new SubInterpreter(jepConfig);
        } catch (JepException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void remove() {
        Jep jep = this.get();
        if (jep != null) {
            try {
                jep.close();
            } catch (JepException ex) {
                throw new RuntimeException(ex);
            }
        }
        super.remove();
    }
};

However, when I run my Java code now, the JVM crashes with the following error. This appears to happen after split_email.py has been called:

12:45:31.452 worker-markup-fs> INFO  [2022-10-11 11:45:31,452] com.github.cafdataprocessing.worker.markup.core.EmailSplitter: Starting email splitting based on document received
12:45:31.960 worker-markup-fs> /usr/lib/python3.10/site-packages/talon/signature/extraction.py:7: UserWarning: NumPy was imported from a Python sub-interpreter but NumPy does not properly support sub-interpreters. This will likely work for most users but might cause hard to track down issues or subtle bugs. A common user of the rare sub-interpreter feature is wsgi which also allows single-interpreter mode.
12:45:31.960 worker-markup-fs> Improvements in the case of bugs are welcome, but is not on the NumPy roadmap, and full support may require significant effort to achieve.
12:45:31.960 worker-markup-fs> import numpy
12:45:32.950 worker-markup-fs> INFO  [2022-10-11 11:45:32,949] com.github.cafdataprocessing.worker.markup.core.EmailSplitter: Email Splitting completed
12:45:32.950 worker-markup-fs> INFO  [2022-10-11 11:45:32,949] com.github.cafdataprocessing.worker.markup.core.MarkupHeadersAndBody: Starting markup of Headers and Body
12:45:32.956 worker-markup-fs> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f6a94e9f000, 16384, 0) failed; error='Not enough space' (errno=12)
12:45:32.956 worker-markup-fs> [31.713s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
12:45:32.956 worker-markup-fs> #
12:45:32.956 worker-markup-fs> # There is insufficient memory for the Java Runtime Environment to continue.
12:45:32.956 worker-markup-fs> # Native memory allocation (mmap) failed to map 16384 bytes for committing reserved memory.

Is there anything I'm doing wrong that could be causing this?
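The log shows both a failed mmap (errno=12) and a failed pthread_create (EAGAIN). Among other causes, failures like these can be related to per-process resource limits, so one quick check is to print the limits currently in effect from inside the embedded interpreter. The sketch below is only an inspection aid under that assumption, not a diagnosis. The choice of limits and the names current_limits and format_limit are hypothetical.

```python
import resource

# Limits that can matter for mmap and thread-creation failures.
# The selection is an assumption for illustration only.
LIMIT_NAMES = ('RLIMIT_DATA', 'RLIMIT_AS', 'RLIMIT_NPROC', 'RLIMIT_STACK')


def current_limits():
    """Return {name: (soft, hard)} for the limits this platform defines."""
    limits = {}
    for name in LIMIT_NAMES:
        res = getattr(resource, name, None)
        if res is not None:
            limits[name] = resource.getrlimit(res)
    return limits


def format_limit(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return 'unlimited' if value == resource.RLIM_INFINITY else str(value)


if __name__ == '__main__':
    for name, (soft, hard) in current_limits().items():
        print(f'{name}: soft={format_limit(soft)} hard={format_limit(hard)}')
```

Running this before and after the split_email module is imported would show whether the setrlimit call in that script changed the limits of the process.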

I did try using SharedInterpreter instead of SubInterpreter but saw the same error. Going by this:

https://github.com/ninia/jep/wiki/SharedInterpreter-vs-SubInterpreter#which-should-i-use

I think SubInterpreter is what I should be using, given how I used Jep instances previously.

I also noticed this warning in the log, but I am not sure whether it is related to the JVM crash:

/usr/lib/python3.10/site-packages/talon/signature/extraction.py:7: UserWarning: NumPy was imported from a Python sub-interpreter but NumPy does not properly support sub-interpreters. This will likely work for most users but might cause hard to track down issues or subtle bugs. A common user of the rare sub-interpreter feature is wsgi which also allows single-interpreter mode.

Many thanks

rorytorneymf commented 1 year ago

Closing this. The memory issue was caused by a problem on my local machine. Apologies.