microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

OnnxRuntimeException and DefaultLogger issues in AWS Lambda runtime #15650

Open DoctorSlimm opened 1 year ago

DoctorSlimm commented 1 year ago

Describe the issue

{
  "errorType": "Runtime.ExitError",
  "errorMessage": "RequestId: f0bd9017-c3d2-496a-b974-b183f837bad9 Error: Runtime exited with error: signal: aborted"
}
Error in cpuinfo: failed to parse the list of possible processors in /sys/devices/system/cpu/possible
Error in cpuinfo: failed to parse the list of present processors in /sys/devices/system/cpu/present
Error in cpuinfo: failed to parse both lists of possible and present processors
Error in cpuinfo: failed to parse the list of possible processors in /sys/devices/system/cpu/possible
Error in cpuinfo: failed to parse the list of present processors in /sys/devices/system/cpu/present
Error in cpuinfo: failed to parse both lists of possible and present processors
terminate called after throwing an instance of 'onnxruntime::OnnxRuntimeException'
what():  /onnxruntime_src/include/onnxruntime/core/common/logging/logging.h:294 static const onnxruntime::logging::Logger& onnxruntime::logging::LoggingManager::DefaultLogger() Attempt to use DefaultLogger but none has been registered.
Error in cpuinfo: failed to parse the list of possible processors in /sys/devices/system/cpu/possible
Error in cpuinfo: failed to parse the list of present processors in /sys/devices/system/cpu/present
Error in cpuinfo: failed to parse both lists of possible and present processors
Error in cpuinfo: failed to parse the list of possible processors in /sys/devices/system/cpu/possible
Error in cpuinfo: failed to parse the list of present processors in /sys/devices/system/cpu/present
Error in cpuinfo: failed to parse both lists of possible and present processors
terminate called after throwing an instance of 'onnxruntime::OnnxRuntimeException'
what():  /onnxruntime_src/include/onnxruntime/core/common/logging/logging.h:294 static const onnxruntime::logging::Logger& onnxruntime::logging::LoggingManager::DefaultLogger() Attempt to use DefaultLogger but none has been registered.
START RequestId: f0bd9017-c3d2-496a-b974-b183f837bad9 Version: $LATEST
RequestId: f0bd9017-c3d2-496a-b974-b183f837bad9 Error: Runtime exited with error: signal: aborted
Runtime.ExitError
END RequestId: f0bd9017-c3d2-496a-b974-b183f837bad9
REPORT RequestId: f0bd9017-c3d2-496a-b974-b183f837bad9  Duration: 8028.84 ms    Billed Duration: 8029 ms    Memory Size: 8000 MB    Max Memory Used: 1673 MB

To reproduce

######################
# Base Container Image
######################

FROM public.ecr.aws/lambda/python:3.9-arm64 AS model

COPY save_model.sh pods.json embedding.py models.py tokenizer.py ./

# Installing git and git-lfs on Amazon Linux 2
# (the packagecloud script only configures the repo; git-lfs still has to be installed)
RUN yum install -y git
RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | bash
RUN yum install -y git-lfs

# Installing tree
RUN yum install -y tree

# Make Cache Directories
RUN mkdir /cache

# Torch Cache (TORCH_CACHE)
RUN mkdir /cache/torch
ENV TORCH_CACHE=/cache/torch

# Sentence Transformers Cache (SENTENCE_TRANSFORMERS_HOME)
RUN mkdir /cache/torch/sentence_transformers
ENV SENTENCE_TRANSFORMERS_HOME=/cache/torch/sentence_transformers

# Transformers Cache (TRANSFORMERS_CACHE)
RUN mkdir /cache/huggingface
RUN mkdir /cache/huggingface/transformers
ENV TRANSFORMERS_CACHE=/cache/huggingface/transformers

# Verify Cache Directories
RUN tree /cache

# Install packages (TODO: pin versions and speed up the install, e.g. with poetry)
RUN python3.9 -m pip install --no-cache-dir torch --extra-index-url https://download.pytorch.org/whl/cpu
RUN python3.9 -m pip install "transformers[torch]" "pinecone-client[grpc]" python-dotenv ccxt
RUN python3.9 -m pip install pillow tqdm numpy scikit-learn scipy nltk sentencepiece

# Installing Sentence Transformers, no dependencies (avoid installing torch GPU)
RUN python3.9 -m pip install --no-deps sentence-transformers

# TODO: Investigate Instructor
# RUN python3.9 -m pip install --no-deps InstructorEmbedding

# Installing Sentence Transformers Models
RUN bash save_model.sh deepset "deepset/all-mpnet-base-v2-table"
RUN bash save_model.sh sentence_transformers "all-MiniLM-L6-v2"

# Installing Instructor
# RUN bash save_model.sh instructor "hkunlp/instructor-base"

# Installing Vanilla Transformers Tokenizer
RUN bash save_model.sh huggingface "sentence-transformers/all-MiniLM-L6-v2"

# Set the production environment variable
ENV PRODUCTION=true
#######################
# Upload Function Image
#######################

FROM mozart_base:latest AS function

COPY .env app.py download.py OnnxEncoder.py extraction.py document_container.py mozart_page.py metadata_classes.py text_distribution.py fetch_data.py upload_dataset.py parse_pdf.py parse_tables.py parse_article.py ./

RUN yum update -y
RUN yum install -y wget

# Installing Upload Function packages
RUN python3.9 -m pip install --upgrade pip wheel
RUN python3.9 -m pip install langchain onnx onnxruntime datasets openai tenacity trafilatura sentry_sdk requests pymupdf beautifulsoup4

# Installing NLTK
RUN python3.9 -m nltk.downloader -d /var/lang/nltk_data punkt

# Set the production environment variable
ENV PRODUCTION=true

# Set AWS Region ("export" is shell syntax and is invalid in a Dockerfile ENV instruction)
ENV AWS_REGION=your_aws_region

# Downloading other things
RUN python3.9 download.py

# Command can be overwritten by providing a different command in the template directly.
CMD ["app.lambda_handler"]
mport numpy as np
import torch
import transformers
from sentence_transformers import SentenceTransformer, models
from ccxt import Exchange as ccxt  # for the sexy utils

class OnnxEncoder:
    """OnxEncoder dedicated to run SentenceTransformer under OnnxRuntime."""

    def __init__(self, session, tokenizer, pooling, normalization):
        self.session = session
        self.tokenizer = tokenizer
        self.max_length = tokenizer.model_max_length
        self.pooling = pooling
        self.normalization = normalization

    def encode(self, sentences: list):

        sentences = [sentences] if isinstance(sentences, str) else sentences

        inputs = {
            k: v.numpy()
            for k, v in self.tokenizer(
                sentences,
                padding=True,
                truncation=True,
                max_length=self.max_length,
                return_tensors="pt",
            ).items()
        }

        # The ONNX export below names the third input "segment_ids" (see
        # input_names in sentence_transformers_onnx), while HF tokenizers
        # emit "token_type_ids". Rename the key so the session accepts the
        # feed; otherwise ORT rejects the input for models such as
        # sentence-transformers/all-MiniLM-L6-v2.
        if 'segment_ids' in [x.name for x in self.session.get_inputs()]:
            inputs['segment_ids'] = inputs.pop('token_type_ids')

        hidden_state = self.session.run(None, inputs)
        sentence_embedding = self.pooling.forward(
            features={
                "token_embeddings": torch.Tensor(hidden_state[0]),
                "attention_mask": torch.Tensor(inputs.get("attention_mask")),
            },
        )

        if self.normalization is not None:
            sentence_embedding = self.normalization.forward(features=sentence_embedding)

        sentence_embedding = sentence_embedding["sentence_embedding"]

        if sentence_embedding.shape[0] == 1:
            sentence_embedding = sentence_embedding[0]

        return sentence_embedding.numpy()

def sentence_transformers_onnx(
    model,
    path,
    do_lower_case=True,
    input_names=["input_ids", "attention_mask", "segment_ids"], # BUG: "token_type_ids" Aha!
    providers=["CPUExecutionProvider"],
):
    """OnxRuntime for sentence transformers.

    Parameters
    ----------
    model
        SentenceTransformer model.
    path
        Model file dedicated to session inference.
    do_lower_case
        Whether or not to lower-case the inputs (set for uncased models).
    input_names
        Fields needed by the Transformer.
    providers
        Execution providers; run the model on CPU or GPU: ["CPUExecutionProvider", "CUDAExecutionProvider"].

    """
    try:
        import onnxruntime as ort
    except ImportError:
        raise ImportError("You need to install onnxruntime.")

    model.save(path)

    configuration = transformers.AutoConfig.from_pretrained(
        path, from_tf=False, local_files_only=True
    )

    tokenizer = transformers.AutoTokenizer.from_pretrained(
        path, do_lower_case=do_lower_case, from_tf=False, local_files_only=True
    )

    encoder = transformers.AutoModel.from_pretrained(
        path, from_tf=False, config=configuration, local_files_only=True
    )

    st = ["cherche"]

    inputs = tokenizer(
        st,
        padding=True,
        truncation=True,
        max_length=tokenizer.model_max_length,
        return_tensors="pt",
    )

    model.eval()

    with torch.no_grad():

        symbolic_names = {0: "batch_size", 1: "max_seq_len"}

        torch.onnx.export(
            encoder,
            args=tuple(inputs.values()),
            f=f"{path}.onx",
            opset_version=13,  # ONNX opset needs to be >= 13 for sentence transformers.
            do_constant_folding=True,
            input_names=input_names,
            output_names=["start", "end"],
            dynamic_axes={
                "input_ids": symbolic_names,
                "attention_mask": symbolic_names,
                "segment_ids": symbolic_names,
                "start": symbolic_names,
                "end": symbolic_names,
            },
        )

        # A SentenceTransformer is a pipeline of [transformer, pooling,
        # (optional) normalization]; pull the pooling module (idx 1) and
        # the normalization module (idx 2) out of the first module list.
        normalization = None
        for modules in model.modules():
            for idx, module in enumerate(modules):
                if idx == 1:
                    pooling = module
                if idx == 2:
                    normalization = module
            break

        return OnnxEncoder(
            session=ort.InferenceSession(f"{path}.onx", providers=providers),
            tokenizer=tokenizer,
            pooling=pooling,
            normalization=normalization,
        )

if __name__ == "__main__":
    model_name = "deepset/all-mpnet-base-v2-table"
    model = sentence_transformers_onnx(
        model=SentenceTransformer(model_name),
        path="onnx_model",
    )
    e = model.encode(["I Love cheese", "I hate cheese"])
    print(e)
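
A quick way to confirm the input-name mismatch noted above is to inspect the exported graph directly (a minimal sketch, reusing the onnx_model.onx file produced by the export above):

import onnxruntime as ort

# List the graph's declared inputs; with the input_names used in the export
# this prints ['input_ids', 'attention_mask', 'segment_ids'] rather than
# 'token_type_ids', which is why encode() renames the tokenizer key.
sess = ort.InferenceSession("onnx_model.onx", providers=["CPUExecutionProvider"])
print([i.name for i in sess.get_inputs()])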

Urgency

Urgent: piloting to early customers this week and I have been up for 2 days.

Platform

Linux

OS Version

AWS Lambda; image built on macOS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

not sure

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

Other / Unknown

Execution Provider Library Version

(CPU Only... ARM64 I think?)

tianleiwu commented 1 year ago

See related https://github.com/microsoft/onnxruntime/issues/10038

DoctorSlimm commented 1 year ago

See related #10038

And how can I make it work without throwing:

Error in cpuinfo: failed to parse both lists of possible and present processors
terminate called after throwing an instance of 'onnxruntime::OnnxRuntimeException'
what():  /onnxruntime_src/include/onnxruntime/core/common/logging/logging.h:294 static const onnxruntime::logging::Logger& onnxruntime::logging::LoggingManager::DefaultLogger() Attempt to use DefaultLogger but none has been registered.
tianleiwu commented 1 year ago

@DoctorSlimm, could you try the binary mentioned in https://github.com/microsoft/onnxruntime/issues/10038#issuecomment-1526156977? It is important to let us know whether it fixes the issue or introduces a new one.

DoctorSlimm commented 1 year ago

@tianleiwu Hello, I have tried your solution and received the same error. I have also posted my source code, build commands, etc. for reference. I think this issue could be down to the fact that I am building the image on a Mac - do you think so? Should I try building it inside AWS CodeBuild on an aarch64 VM instead?

https://github.com/microsoft/onnxruntime/issues/10038#issuecomment-1528154009
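
For what it's worth, one way to rule out the macOS host as the variable is forcing a linux/arm64 build with Docker Buildx before trying CodeBuild (a sketch; the image tag and Dockerfile path are assumptions):

# Build explicitly for linux/arm64 regardless of the host; on a Mac this runs under QEMU emulation
docker buildx build --platform linux/arm64 -t mozart_base:latest -f Dockerfile .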

tianleiwu commented 1 year ago

@DoctorSlimm, it seems that the env is not initialized properly; I am not sure about the root cause. If you are able to ssh into your VM, you can manually set up a python environment (with minimal packages) and test session creation with any onnx model, as in the sketch below.
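
A minimal sketch of such a test (any small .onnx file works; the filename here is a placeholder):

import onnxruntime as ort

print(ort.__version__)
# Creating an InferenceSession triggers cpuinfo detection and default-logger
# setup, so this alone should reproduce the abort if the environment is at fault.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
print("session created OK")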

skottmckay commented 1 year ago

@DoctorSlimm

I have tried your solution and received the same error.

I don't quite understand how that is possible if you were using the custom build, as the change in that build would only attempt to use the default logger if it existed.

i.e. this check is made first:

static bool HasDefaultLogger() { return nullptr != s_default_logger_; }

but the error comes from the attempt to use the default logger:

inline const Logger& LoggingManager::DefaultLogger() {
  if (s_default_logger_ == nullptr) {
    // fail early for attempted misuse. don't use logging macros as we have no logger.
    ORT_THROW("Attempt to use DefaultLogger but none has been registered.");
  }
  return *s_default_logger_;
}

For that to happen s_default_logger_, a static member of the LoggingManager, would have to be null and not null at essentially the same time.

I don't believe that's possible, which would suggest your setup was not using the custom onnxruntime python package if the error is coming from the cpuinfo init.

Of course it could be a completely unrelated place that is attempting to use the default logger. We could set up a different custom build to test whether that is the case.
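
The guarded pattern being described would look roughly like this (a sketch of the idea, not the exact call site in the custom build):

// Only touch the default logger when one has been registered, so early
// failures (e.g. during cpuinfo init) cannot reach the ORT_THROW path.
if (onnxruntime::logging::LoggingManager::HasDefaultLogger()) {
  const onnxruntime::logging::Logger& logger =
      onnxruntime::logging::LoggingManager::DefaultLogger();
  // ... log the cpuinfo failure through `logger` ...
} else {
  // no logger yet; fall back to stderr instead of throwing
  fprintf(stderr, "cpuinfo initialization failed\n");
}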

ashokrajab commented 1 year ago

@DoctorSlimm,

RUN bash save_model.sh instructor "hkunlp/instructor-base"

Were you able to successfully convert the Instructor model to ONNX format?