ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Cannot run/install finetuning colab notebook #3881

Closed dotXem closed 10 months ago

dotXem commented 10 months ago

Describe the bug

The demo Colab notebook for fine-tuning Llama-2-7b crashes at the third runnable cell when it tries to import torch.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-dac5961b998e> in <cell line: 5>()
      3 import logging
      4 import os
----> 5 import torch
      6 import yaml
      7 

12 frames
/usr/lib/python3.10/_pyio.py in __init__(self, buffer, encoding, errors, newline, line_buffering, write_through)
   2043                 encoding = "utf-8"
   2044             else:
-> 2045                 encoding = locale.getpreferredencoding(False)
   2046 
   2047         if not isinstance(encoding, str):

TypeError: <lambda>() takes 0 positional arguments but 1 was given

To Reproduce

  1. Go to https://colab.research.google.com/drive/1r4oSEwRJpYKBPM0M0RSh0pBEYK_gBKbe
  2. Connect T4 GPU
  3. Run the first three cells
  4. The third cell fails with the error message above

Expected behavior

It should work!

Environment (please complete the following information):

(not sure if relevant)

arnavgarg1 commented 10 months ago

Hi @dotXem! Thanks for reporting the issue - I can confirm that I'm able to repro it with the steps you've provided. Let me get back to you with a root cause and fix soon! Apologies that this didn't work as expected out of the box.

arnavgarg1 commented 10 months ago

@dotXem I've found the issue and updated the notebook(s) on the Ludwig README, including the one you're trying - are you able to give it a quick run-through to see if the issue is fixed?

For context, the way we were forcing UTF-8 as the default encoding (monkey-patching locale.getpreferredencoding with a zero-argument lambda) doesn't play nicely with torch 2.1: during import, Python calls locale.getpreferredencoding(False) with a positional argument, which the lambda can't accept - that's the TypeError in your traceback. I've updated the notebook to use the recommended approach instead, and it works well.

This is what I changed

Current:

import locale; locale.getpreferredencoding = lambda: "UTF-8"

New:

import locale; locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

Let me know how it goes!
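For anyone curious why the original line breaks under torch 2.1, here is a minimal sketch of the failure and the fix; it assumes the en_US.UTF-8 locale is installed, which is the case on the Colab runtime:

import locale

# The old notebook cell rebound getpreferredencoding to a lambda that takes
# no arguments. Python's io layer calls locale.getpreferredencoding(False)
# with one positional argument, which produces the
# "TypeError: <lambda>() takes 0 positional arguments but 1 was given"
# seen in the traceback above.

# The updated cell avoids patching the function and sets the process locale
# instead (assumes the en_US.UTF-8 locale is available, as it is on Colab):
locale.setlocale(locale.LC_ALL, "en_US.UTF-8")

# Afterwards the preferred encoding should report UTF-8 and `import torch`
# proceeds normally.
print(locale.getpreferredencoding(False))  # expected: UTF-8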

dotXem commented 10 months ago

It's working! Thanks for the quick fix!
