UbiquitousLearning / mllm

Fast Multimodal LLM on Mobile Devices
https://ubiquitouslearning.github.io/mllm_website
MIT License

Trying to add a custom model but it cannot run properly on Android #78

Closed TiAmoLip closed 5 months ago

TiAmoLip commented 5 months ago

I am trying to add tinyllama chat to the Android app (the Llama2 demo works fine), but it fails. My steps are listed below:

lx200916 commented 5 months ago

Looking into it...

lx200916 commented 5 months ago

The code looks fine to me. Could you check whether there is any ERROR in logcat, maybe tagged with "MLLM" or something related?

TiAmoLip commented 5 months ago

Sorry, I haven't found anything useful in logcat...

For llama2, the log lines tagged MLLM are:

    Setup!
    param size:291   // 201 for tiny
    MODEL TYPE:0     // 2 for tiny
    tokenizer size:31902

which means the try/catch in JNIBridge.run never catches an exception.

The logs tagged "chatViewModel" are the same for llama2 and tiny, both "files:0". I am also wondering what the param size indicates; it does not seem to be the on-disk size. Apart from those, the noisy "BLASTBufferQueue" messages flood the window.

P.S. I forgot to mention that I also added TINY to PreDefinedModel in LibHelper.hpp in Android Studio, although that may not matter.
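
For context, the enum change I mean looks roughly like this (a sketch: the entry names and the FUYU value are my assumption, while 0 and 2 match the MODEL TYPE values logged above):

    // LibHelper.hpp (sketch): PreDefinedModel with the new TINY entry.
    // LLAMA = 0 and TINY = 2 match the logged MODEL TYPE values; FUYU = 1 is assumed.
    enum PreDefinedModel {
        LLAMA = 0,
        FUYU = 1,
        TINY = 2, // newly added for tinyllama
    };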

lx200916 commented 5 months ago

That's weird... I may need some time to reproduce the scenario. BTW, you are using the Q4_K tinyllama weights from our HF, right?

TiAmoLip commented 5 months ago

Yes, the weights (tinyllama-1.1b-chat-q4_k.mllm) were downloaded from your HF and renamed, and the vocab file is from this repo.

TiAmoLip commented 5 months ago

Maybe my code is never executed?...

I added LOGI calls to the run function in LibHelper.cpp, ran build_android_app.sh, and replaced libmllm_lib.a.

...
    for (int step = 0; step < max_step; step++) {
        unsigned int token_idx;
        if (model_ == FUYU) {
            LOGI("Image Patch!");
            executor_->run(net_, {input_id, img_patch, img_patch_id});
            auto result = executor_->result();
            token_idx = postProcessing(result[0], input_id);
            fullTensor(img_patch, net_, {0, 0, 0, 0}, 1.0F);
            fullTensor(img_patch_id, net_, {0, 0, 0, 0}, 1.0F);
        } else {
            executor_->run(net_, {input});
            auto result = executor_->result();
            token_idx = postProcessing(result[0], input);
            LOGI("MLLM TINY token_idx:%u", token_idx); /*new*/
        }
        const auto out_token = tokenizer_->detokenize({token_idx});
        LOGI("MLLM TINY out_token:%s", out_token.c_str()); /*new*/
        if (out_token == "</s>" || token_idx == eos_id_) {
            callback_(out_string, true);
            break;
        }
        out_string += out_token;
        callback_(out_string, step == max_step - 1);
    }
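
(For reference: LOGI is the usual Android NDK logging macro. I am assuming LibHelper.cpp defines it along these standard lines, which is why the messages would land in logcat under the "MLLM" tag.)

    // Assumed definition of LOGI (standard NDK pattern, not verified against
    // this repo); __android_log_print comes from <android/log.h>.
    #include <android/log.h>
    #define LOG_TAG "MLLM"
    #define LOGI(...) __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)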

However, neither tiny nor llama2 prints these log lines, so I doubt whether my previous steps were correct.

lx200916 commented 5 months ago

Hi, I managed to set up the env and successfully run the model. I think there are a few things to note:

  1. Check the sendMessage fun (ChatViewModel.sendMessage): we added a double check here. [Actually, we did not expect any new model in our demo app... so terribly sorry 😭]
    if (arrayOf(0, 2).contains(modelType.value)) {
    // if (modelType.value == 0) {
        CoroutineScope(Dispatchers.IO).launch {
            // val run_text = "A dialog, where User interacts with AI. AI is helpful, kind, obedient, honest, and knows its own limits.\nUser: ${message.text}"
            JNIBridge.run(bot_message.id, message.text, 100)
        }
    }
  2. Check that every switch-case expression in LibHelper.cpp ends with break; properly (see the sketch below).
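
To illustrate note 2 (a hypothetical sketch, not the actual LibHelper.cpp code): if the case for your new model type misses its break;, execution silently falls through into the next branch.

    // Hypothetical sketch of note 2, not the real LibHelper.cpp code.
    switch (model_) {
        case LLAMA:
            // ... llama2-specific setup ...
            break;
        case TINY:
            // ... tinyllama-specific setup ...
            break; // easy to forget when copy-pasting a new case; without it,
                   // TINY falls through into the next label
        default:
            break;
    }
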
TiAmoLip commented 5 months ago

Thank you so much! The model can now run properly.

lx200916 commented 4 months ago

> Thank you so much! The model can now run properly.

Glad to hear that! You can also make a pull request for this feature if you like, so others can benefit too 😘