ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0
11.22k stars, 1.19k forks

Cannot install ludwig in Kaggle notebook #3814

Open yogeshhk opened 12 months ago

yogeshhk commented 12 months ago

Describe the bug: I want to run the Ludwig examples in a Kaggle notebook, but I am not able to install Ludwig.

To Reproduce Steps to reproduce the behavior:

  1. Go to 'kaggle.com'
  2. Click on 'Create', New Notebook
  3. Scroll down to create a new Code cell
  4. Type
    !pip uninstall -y tensorflow --quiet
    !pip install ludwig
    !pip install ludwig[llm]

This produces the following error:

ERROR: Cannot uninstall pyyaml 6.0.1, RECORD file not found. You might be able to recover from this via: 'pip install --force-reinstall --no-deps pyyaml==6.0.1'.
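
pip raises this error when it cannot find the `RECORD` file that lists the files a package installed, so it does not know what to remove. A minimal sketch for checking this from a notebook cell, using the standard-library `importlib.metadata`; the helper name `record_status` is hypothetical and is not part of Ludwig or pip:

```python
from importlib import metadata


def record_status(package: str) -> str:
    """Report whether an installed package has a readable RECORD file."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    # read_text returns None when the metadata file is absent.
    if dist.read_text("RECORD") is None:
        return f"{dist.version}: RECORD missing"
    return f"{dist.version}: RECORD present"


print(record_status("pyyaml"))
```

If this reports a missing `RECORD`, the package was probably installed by something other than pip, which would be consistent with the message above.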

Expected behavior ludwig should install without errors.

Environment (please complete the following information):

Default-standard Kaggle Notebook environment

arnavgarg1 commented 11 months ago

Hi @yogeshhk! Thanks for flagging this issue - I was able to repro it. There definitely seems to be a strange error with PyYAML installation out of the box, which is something we can look into.

For now, I found that running this sequence of commands gets you a stable Ludwig installation without any errors:

!sudo pip uninstall -y tensorflow --quiet
!sudo pip install ludwig
!sudo pip install ludwig[llm]

The only difference is running pip with sudo, which seems to force an override of whatever PyYAML installation already exists in the Kaggle notebook environment, I think. One thing to note is that installation is somewhat slow, since it also appears to force-install CUDA-related dependencies such as the Python wrapper for the CUDA runtime.
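
As an alternative to `sudo`, the pip error message itself suggests force-reinstalling PyYAML without dependencies. Below is a sketch of running that command against the notebook kernel's own interpreter, so that packages land where the kernel imports from. Whether this resolves the problem on Kaggle is untested here, and the helper `pip_cmd` and the `RUN` switch are hypothetical names used only for illustration:

```python
import subprocess
import sys


def pip_cmd(*args: str) -> list[str]:
    """Build a pip invocation bound to the interpreter running this cell."""
    return [sys.executable, "-m", "pip", *args]


# Recovery command copied from the pip error message.
repair = pip_cmd("install", "--force-reinstall", "--no-deps", "pyyaml==6.0.1")
install = pip_cmd("install", "ludwig[llm]")

RUN = False  # set to True inside the notebook to actually run the commands
for cmd in (repair, install):
    print(" ".join(cmd))
    if RUN:
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list also avoids any shell interpretation of the square brackets in `ludwig[llm]`.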

Let me know if this helps unblock you.

yogeshhk commented 11 months ago

Thanks @arnavgarg1. That seems to help; the installation was successful. Later in the notebook, however, some errors came up.

`import torch` gave:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 5
      3 import logging
      4 import os
----> 5 import torch
      6 import yaml
      8 from ludwig.api import LudwigModel

File /opt/conda/lib/python3.10/site-packages/torch/__init__.py:1253
   1251 import torch.backends.openmp
   1252 import torch.backends.quantized
-> 1253 import torch.utils.data
   1254 from torch import __config__ as __config__
   1255 from torch import __future__ as __future__
...
...

File /opt/conda/lib/python3.10/site-packages/dill/_dill.py:168
    166 try:
    167     from _pyio import open as _open
--> 168     PyTextWrapperType = get_file_type('r', buffering=-1, open=_open)
    169     PyBufferedRandomType = get_file_type('r+b', buffering=-1, open=_open)
    170     PyBufferedReaderType = get_file_type('rb', buffering=-1, open=_open)

File /opt/conda/lib/python3.10/site-packages/dill/_dill.py:156, in get_file_type(*args, **kwargs)
    154 def get_file_type(*args, **kwargs):
    155     open = kwargs.pop("open", __builtin__.open)
--> 156     f = open(os.devnull, *args, **kwargs)
    157     t = type(f)
    158     f.close()

File /opt/conda/lib/python3.10/_pyio.py:282, in open(file, mode, buffering, encoding, errors, newline, closefd, opener)
    280     return result
    281 encoding = text_encoding(encoding)
--> 282 text = TextIOWrapper(buffer, encoding, errors, newline, line_buffering)
    283 result = text
    284 text.mode = mode

File /opt/conda/lib/python3.10/_pyio.py:2045, in TextIOWrapper.__init__(self, buffer, encoding, errors, newline, line_buffering, write_through)
   2043         encoding = "utf-8"
   2044     else:
-> 2045         encoding = locale.getpreferredencoding(False)
   2047 if not isinstance(encoding, str):
   2048     raise ValueError("invalid encoding: %r" % encoding)

TypeError: <lambda>() takes 0 positional arguments but 1 was given
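
The last frame shows `locale.getpreferredencoding(False)` failing inside a `<lambda>`, which suggests that something in the session replaced `locale.getpreferredencoding` with a function that accepts no arguments. The thread does not say where such a replacement came from, so treat the following as a sketch under that assumption. It reproduces the `TypeError` without touching the real `locale` module and shows a signature that accepts the flag; the names `zero_arg_override`, `compatible_override`, and `call_like_pyio` are hypothetical:

```python
def call_like_pyio(func):
    """Call func the way _pyio does: with one positional argument, False."""
    try:
        return func(False)
    except TypeError as exc:
        return f"TypeError: {exc}"


# An override that takes no arguments fails when called with the flag.
zero_arg_override = lambda: "UTF-8"

# Mirroring the real signature, getpreferredencoding(do_setlocale=True), works.
compatible_override = lambda do_setlocale=True: "UTF-8"

print(call_like_pyio(zero_arg_override))
print(call_like_pyio(compatible_override))
```

If the notebook contains an override like the first lambda, giving it a `do_setlocale=True` parameter, or removing the override, may let `import torch` proceed.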

Commenting out the `import torch` line then gave this error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[11], line 8
      5 # import torch
      6 import yaml
----> 8 from ludwig.api import LudwigModel
     11 os.environ["HUGGING_FACE_HUB_TOKEN"] = getpass.getpass("Token:")
     12 assert os.environ["HUGGING_FACE_HUB_TOKEN"]

ModuleNotFoundError: No module named 'ludwig'
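
One possible cause, not confirmed in this thread, is that `sudo pip` installed Ludwig into a different Python than the one under `/opt/conda` that runs the notebook. A minimal check using only the standard library; the helper `where_is` is a hypothetical name:

```python
import importlib.util
import sys


def where_is(module: str) -> str:
    """Return the file a top-level module would be imported from, if any."""
    spec = importlib.util.find_spec(module)
    if spec is None:
        return f"{module}: not importable from {sys.executable}"
    return f"{module}: {spec.origin}"


print(sys.executable)      # interpreter used by the notebook kernel
print(where_is("ludwig"))  # where Ludwig would be imported from, if anywhere
```

If Ludwig is reported as not importable even though the install succeeded, comparing `sys.executable` with the Python that `sudo pip` targets would be a reasonable next step.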

Is it possible to try just importing these on your side?

My notebook is shared if you want to see the context: https://www.kaggle.com/code/yogeshkulkarni/midcurvellm-finetune-ludwig

HamidRezaAttar commented 11 months ago

Hi @yogeshhk, did you find a workaround for this problem? I'm having the same issue.

yogeshhk commented 11 months ago

No, @HamidRezaAttar. I am assuming the Ludwig team may look into it in time [@alexsherstinsky @arnavgarg1].