modularml / max

A collection of sample programs, notebooks, and tools which highlight the power of the MAX Platform
https://www.modular.com

[BUG]: Receiving error LLCL 'Assertion `runtime && "no runtime is associated with the current thread"' failed' #127

Closed Sajtospoga01 closed 2 months ago

Sajtospoga01 commented 6 months ago

Bug description

After compiling the mojo code, I receive an error: 'Assertion `runtime && "no runtime is associated with the current thread"' failed' from: /__w/modular/modular/LLCL/include/LLCL/Runtime/Runtime.h:82

This error only occurs when the project is compiled (with mojo build helloworld.mojo), not when it is executed just-in-time.

Steps to reproduce

The simple code I had (from the docs site); it crashes when creating the engine.InferenceSession():

from python import Python

from max import engine
from pathlib import Path
from tensor import Tensor, TensorShape, TensorSpec
from algorithm import argmax

fn infer() raises:
    var model_path = "roberta"
    var batch = 1
    var seqlen = 128
    var input_ids_spec = TensorSpec(DType.int64, batch, seqlen)
    var attention_mask_spec = TensorSpec(DType.int64, batch, seqlen)
    print("loading options")
    var options = engine.LoadOptions()
    options.add_input_spec(input_ids_spec)
    options.add_input_spec(attention_mask_spec)

    print("creating inference session")
    var session = engine.InferenceSession()
    print("loading model")
    var model = session.load_model(model_path)

    print("loading tokenizer")
    var INPUT = "There are many exciting developments in the field of AI Infrastructure!"
    var HF_MODEL_NAME = "cardiffnlp/twitter-roberta-base-emotion-multilabel-latest"
    var transformers = Python.import_module("transformers")
    var tokenizer = transformers.AutoTokenizer.from_pretrained(HF_MODEL_NAME)
    var inputs = tokenizer(INPUT, None, None, None, True, 'max_length', True,
        seqlen, 0, False, None, 'np', True, None, False, False, False, False, True)

    print("generating input")
    var input_ids = inputs["input_ids"]
    var token_type_ids = inputs["token_type_ids"]
    var attention_mask = inputs["attention_mask"]
    var outputs = model.execute("input_ids", input_ids,
                                "token_type_ids", token_type_ids,
                                "attention_mask", attention_mask)
    var logits = outputs.get[DType.float32]("logits")
    var predicted_class_id = argmax_tensor(logits)
    var classification = tokenizer.config.id2label[predicted_class_id]
    print("The sentiment is:", classification)

def argmax_tensor(
    borrowed input: Tensor[DType.float32]
    ) -> Scalar[DType.float32]:
    var output = Tensor[DType.float32](TensorShape(1, 1))

    argmax(input._to_ndbuffer[2](), -1, output._to_ndbuffer[2]())

    return output[0]    

fn main():
    try: 
        infer()
    except:
        print("Problem reading file!")

The command line out:

➜ mojo ⚡ mojo build helloworld.mojo
➜ mojo ⚡ ./helloworld
loading options
creating inference session
helloworld: /__w/modular/modular/LLCL/include/LLCL/Runtime/Runtime.h:82: static M::LLCL::Runtime &M::LLCL::Runtime::getCurrentRuntime(): Assertion `runtime && "no runtime is associated with the current thread"' failed.
Aborted

No further steps are performed.

System information

- What OS did you install MAX on? Windows running in WSL 2
- Provide version information for MAX by pasting the output of `max -v` max 24.1.1 (0ab415f7)
- Provide version information for Mojo by pasting the output of `mojo -v` mojo 24.1.1 (0ab415f7)
- Provide Modular CLI version by pasting the output of `modular -v` modular 0.6.0 (04c05243)
ehsanmok commented 6 months ago

The model_path should refer to a saved TorchScript RoBERTa model. Is that the case?
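For reference, saving a TorchScript model is done in plain Python before running the Mojo program. The sketch below is a hypothetical stand-in: a tiny traced module is used instead of the actual Hugging Face RoBERTa model (which would be loaded from HF_MODEL_NAME and traced the same way), and the output filename roberta.torchscript is an assumed placeholder for whatever path is passed as model_path.

```python
# Hypothetical TorchScript export sketch (run once, before the Mojo program).
# A tiny module stands in for the real RoBERTa model to show the trace/save steps.
import torch

class TinyClassifier(torch.nn.Module):
    def forward(self, input_ids, attention_mask):
        # Stand-in for the real forward pass; returns one logit per batch item.
        return (input_ids * attention_mask).float().sum(dim=-1, keepdim=True)

model = TinyClassifier().eval()
# Example inputs matching the (batch=1, seqlen=128) int64 specs from the issue.
example = (torch.zeros(1, 128, dtype=torch.int64),
           torch.ones(1, 128, dtype=torch.int64))
traced = torch.jit.trace(model, example)
traced.save("roberta.torchscript")  # this path would be used as model_path
```

For the real model, one would trace transformers.AutoModelForSequenceClassification loaded from HF_MODEL_NAME with tokenized example inputs instead of the toy module.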

Sajtospoga01 commented 6 months ago

Yes, though the runtime crashes on the line var session = engine.InferenceSession(). I changed the path afterwards, but the issue remains the same: it works when executed just-in-time but fails when built first and then run.

To summarize, what works is either:

mojo helloworld.mojo

or

mojo run helloworld.mojo

What does not work:

mojo build helloworld.mojo

then

./helloworld
ehsanmok commented 6 months ago

Thanks for reporting! We are looking into this.

sparadiso commented 6 months ago

Out of curiosity @Sajtospoga01 -- does this issue persist with the latest v24.2 build (modular update max && modular update mojo)?

Sajtospoga01 commented 6 months ago

Yes, I just updated it to the latest version, and I am encountering the same error.

iamtimdavis commented 2 months ago

@Sajtospoga01 - are you able to try 24.4 and see if this is still occurring and re-open if it is? Thanks

Sajtospoga01 commented 1 month ago

Hi @iamtimdavis, sorry for getting back so late. I just checked the release and it seems to be working well now, thank you for the fix 🙂