replit / ReplitLM

Inference code and configs for the ReplitLM model family
https://huggingface.co/replit
Apache License 2.0

Warnings and Errors when generating with the given code #9

Closed · Symbolk closed this issue 1 year ago

Symbolk commented 1 year ago

A nice model for code generation! I'd like to test this model on other-language versions of HumanEval, and here is my code:

from transformers import AutoModelForCausalLM, AutoTokenizer
from tqdm import tqdm
import os
import json
from loguru import logger

logger.add("output_go.log")

os.environ['CURL_CA_BUNDLE'] = ""
os.environ["CUDA_VISIBLE_DEVICES"] = "7"

logger.info('loading model...')
tokenizer = AutoTokenizer.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
# with CUDA_VISIBLE_DEVICES="7", the only visible GPU is re-indexed as cuda:0
device = 'cuda:0'
model.to(device=device)
logger.info('model loaded.')

with open('humaneval_go.jsonl', 'r') as fr:
    lines = fr.readlines()

logger.info(len(lines))

with open('output_go.jsonl', 'a', encoding='utf-8') as fw:
    for i, line in tqdm(enumerate(lines), total=len(lines)):
        logger.info(i)
        task = json.loads(line)
        x = tokenizer.encode(task['prompt'], return_tensors='pt').to(device=device)
        y = model.generate(x, max_length=768, do_sample=True, top_p=0.95, top_k=4, temperature=0.2,
                           num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)

        # decode with clean_up_tokenization_spaces=False to preserve syntactic correctness
        generated_code = tokenizer.decode(y[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)
        result = {
            'question_id': f'HumanEval/{i}',
            'snippets': [generated_code],
        }
        fw.write(json.dumps(result) + '\n')
        logger.info(generated_code)

Running on CPU seems slow but otherwise OK, if we ignore warnings like the following (how can I avoid them?):

/root/.cache/huggingface/modules/transformers_modules/replit/replit-code-v1-3b/9eceafb041eb8abd565dabfbfadd328869140011/attention.py:290: UserWarning: Using `attn_impl: torch`. If your model does not use `alibi` or `prefix_lm` we recommend using `attn_impl: flash` otherwise we recommend using `attn_impl: triton`.
  warnings.warn(
You are using config.init_device='cpu', but you can also use config.init_device="meta" with Composer + FSDP for fast initialization.

(Apologies in advance if these are newbie questions, but I do believe a complete end-to-end inference demo would save us all a lot of time!)

madhavatreplit commented 1 year ago

Thanks for raising this!

The warnings come from the replit_lm.py file, where the authors emit them via Python's warnings library to recommend settings whenever a customized model configuration is used.

You can safely ignore them if using our default configs as described in the README.
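
If you do want to follow the warning's recommendation when running on GPU, you can override the attention implementation through the model config before loading, roughly like the triton example in our README (triton requires a GPU and the triton package installed):

import torch
from transformers import AutoConfig, AutoModelForCausalLM

# load the config first so the attention implementation can be overridden
config = AutoConfig.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
config.attn_config['attn_impl'] = 'triton'

model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', config=config,
                                             trust_remote_code=True)
model.to(device='cuda:0', dtype=torch.bfloat16)  # triton kernels expect fp16/bf16 on GPU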

If you really want to suppress them entirely, I think you can add something like this to your main script:

import warnings
warnings.filterwarnings("ignore")

Note that this blanket filter also suppresses every other warning raised in our codebase (and in your dependencies), so it is not the best way to do this.
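
A more targeted option is to filter only this specific message with the standard warnings API; the message prefix below is copied from the warning output, and filterwarnings matches it as a regex against the start of the message:

import warnings

# suppress only the attn_impl recommendation; other warnings still surface
warnings.filterwarnings("ignore", message="Using `attn_impl: torch`")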

Hope that helps!

madhavatreplit commented 1 year ago

Closing. Author can reopen if needed.