yurijmikhalevich / rclip

AI-Powered Command-Line Photo Search Tool
MIT License

feat: add support for running `rclip` in a notebook #105

Closed (sugizo closed this 5 months ago)

sugizo commented 5 months ago

steps

pip install --extra-index-url https://download.pytorch.org/whl/cpu rclip

execute (all three got the same result)

!rclip https://raw.githubusercontent.com/yurijmikhalevich/rclip/main/tests/e2e/images/cat.jpg
!rclip horse + stripes
!rclip apple - fruit

result

Traceback (most recent call last):
  File "/usr/local/bin/rclip", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/rclip/main.py", line 205, in main
    arg_parser = helpers.init_arg_parser()
  File "/usr/local/lib/python3.10/dist-packages/rclip/utils/helpers.py", line 78, in init_arg_parser
    textwrap.fill(
  File "/usr/lib/python3.10/textwrap.py", line 399, in fill
    return w.fill(text)
  File "/usr/lib/python3.10/textwrap.py", line 371, in fill
    return "\n".join(self.wrap(text))
  File "/usr/lib/python3.10/textwrap.py", line 362, in wrap
    return self._wrap_chunks(chunks)
  File "/usr/lib/python3.10/textwrap.py", line 256, in _wrap_chunks
    raise ValueError("invalid width %r (must be > 0)" % self.width)
ValueError: invalid width -2 (must be > 0)

best regards

yurijmikhalevich commented 5 months ago

@sugizo, are you running this in a notebook?

rclip doesn't support notebooks yet; I'll keep this issue open to track progress on adding notebook support.
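
A possible reading of the traceback above, not confirmed anywhere in this thread: the help text is wrapped to a width derived from the terminal size, and a notebook shell can report a terminal of 0 columns, so a "columns minus 2" calculation ends up at -2. The sketch below reproduces the same ValueError with textwrap.fill and shows one way to clamp the width. The helper safe_wrap_width and its margin and limits are hypothetical; this is not rclip's actual code, nor the fix that was shipped later.

```python
import shutil
import textwrap


def safe_wrap_width(columns, margin=2, max_width=100, min_width=20):
    """Clamp a wrap width derived from the terminal width to a positive range.

    Hypothetical helper for illustration only; the limits are assumptions.
    """
    return max(min_width, min(max_width, columns - margin))


# Assumption: the notebook reports a terminal that is 0 columns wide.
reported_columns = 0
failure = None
try:
    textwrap.fill("some help text", width=reported_columns - 2)
except ValueError as error:
    # Same message as in the traceback: the width must be greater than 0.
    failure = str(error)

# With the clamp, wrapping works even when the reported width is unusable.
columns = shutil.get_terminal_size(fallback=(80, 24)).columns
wrapped = textwrap.fill("some help text " * 10, width=safe_wrap_width(columns))
```

The clamp is applied to the computed width rather than relying only on the fallback argument, because the reported size may still come back as 0 columns in some environments.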

yurijmikhalevich commented 5 months ago

@sugizo, as a workaround, you can import rclip directly into the notebook. For example:

import numpy as np
from tqdm import tqdm
from rclip.main import init_rclip

# DATA_DIR, BATCH_SIZE, DEVICE, tags, and current_directory are placeholders; define them before running this
_, model, rclipDB = init_rclip(DATA_DIR, BATCH_SIZE, DEVICE)  # this will also index `DATA_DIR`
tag_features = model.compute_text_features([f'photo of {tag}' for tag in tags])
for image in tqdm(rclipDB.get_image_vectors_by_dir_path(current_directory), unit='images'):
  image_features = np.frombuffer(image['vector'], np.float32)
  # do whatever you need with `image_features` and `tag_features`; compute similarity, etc

sugizo commented 5 months ago

Yes, I'm running it in a notebook. Here are the details:

info

code

import os
from rclip.main import init_rclip
DATASET_DIR = os.getenv('DATASET_DIR', os.path.join(os.path.dirname(__file__), 'datasets', 'objectnet-1.0') )
BATCH_SIZE = int(os.getenv('BATCH_SIZE', 256) )
DEVICE = os.getenv('DEVICE', 'cpu')
_, model, rclipDB = init_rclip(DATA_DIR, BATCH_SIZE, DEVICE)  # this will also index `DATA_DIR`
tag_features = model.compute_text_features([f'photo of {tag}' for tag in tags] )
for image in tqdm(rclipDB.get_image_vectors_by_dir_path(current_directory), unit = 'images'):
    image_features = np.frombuffer(image['vector'], np.float32)

result

NameError                                 Traceback (most recent call last)
<ipython-input-6-51821106f897> in <cell line: 4>()
      2 from rclip.main import init_rclip
      3 
----> 4 DATASET_DIR = os.getenv('DATASET_DIR', os.path.join(os.path.dirname(__file__), 'datasets', 'objectnet-1.0'))
      5 BATCH_SIZE = int(os.getenv('BATCH_SIZE', 256))
      6 DEVICE = os.getenv('DEVICE', 'cpu')

NameError: name '__file__' is not defined

best regards

yurijmikhalevich commented 5 months ago

@sugizo, __file__ is a special variable that is not defined in Python code running in notebooks. When calling init_rclip this way, you have to construct the path to the directory with the images you want to process yourself and pass it as the first argument.
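
A minimal sketch of one way to build that path in a notebook, assuming the images live in (or under) the notebook's current working directory. The helper resolve_dataset_dir and its default_subdir parameter are hypothetical names used only for illustration; the DATASET_DIR environment variable is taken from the snippet above.

```python
import os
from pathlib import Path


def resolve_dataset_dir(default_subdir=""):
    """Return DATASET_DIR from the environment, else a path under the working directory.

    Hypothetical helper: notebooks have no __file__, but they do have a
    current working directory, so the fallback is built from Path.cwd().
    """
    base = Path.cwd()
    fallback = base / default_subdir if default_subdir else base
    return os.getenv("DATASET_DIR", str(fallback))


# The resulting string can then be passed as the first argument to init_rclip.
DATASET_DIR = resolve_dataset_dir()
```

Passing an absolute path avoids depending on where the notebook kernel happened to be started.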

yurijmikhalevich commented 5 months ago

@sugizo, version 1.8.6 has been published with a fix for the issue you encountered. Can you please confirm that it solves the issue for you, too?

Still, it's better to use rclip in the terminal; some features, like image preview, won't work in notebooks. Alternatively, you can use rclip as a library in notebooks, as I showed before, but this "library" API isn't public and may change between versions.

What are you trying to solve with rclip? If I understand your use case better, I may have more ideas.

sugizo commented 5 months ago

steps

pip install --extra-index-url https://download.pytorch.org/whl/cpu rclip
!wget -c 'https://i.ytimg.com/vi/vEYsdh6uiS4/maxresdefault.jpg'
import os
from rclip.main import init_rclip
DATASET_DIR = os.getenv('DATASET_DIR', os.path.join(os.path.dirname('./') ) )
BATCH_SIZE = int(os.getenv('BATCH_SIZE', 256) )
DEVICE = os.getenv('DEVICE', 'cpu')
_, model, rclipDB = init_rclip(DATASET_DIR, BATCH_SIZE, DEVICE)  # this will also index `DATASET_DIR`
tag_features = model.compute_text_features([f'photo of {tag}' for tag in tags] )
for image in tqdm(rclipDB.get_image_vectors_by_dir_path(current_directory), unit = 'images'):
    image_features = np.frombuffer(image['vector'], np.float32)

result

checking images in the current directory for changes; use "--no-indexing" to skip this if no images were added, changed, or removed
100%|██████████| 1/1 [00:00<00:00, 94.34images/s] 

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-1fcf6811c109> in <cell line: 7>()
      5 DEVICE = os.getenv('DEVICE', 'cpu')
      6 _, model, rclipDB = init_rclip(DATASET_DIR, BATCH_SIZE, DEVICE)  # this will also index `DATASET_DIR`
----> 7 tag_features = model.compute_text_features([f'photo of {tag}' for tag in tags] )
      8 for image in tqdm(rclipDB.get_image_vectors_by_dir_path(current_directory), unit = 'images'):
      9     image_features = np.frombuffer(image['vector'], np.float32)

NameError: name 'tags' is not defined
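
This last NameError occurs because tags was only a placeholder in the workaround snippet; current_directory, np, and tqdm would presumably fail next for the same reason unless they are defined or imported. The sketch below shows what the missing pieces might look like. To stay runnable without rclip installed, it uses hypothetical stand-in values instead of the real model and database, and it decodes the float32 buffer with the standard library (np.frombuffer(raw, np.float32) does the same job in the snippet above). The tag list, the vectors, and both helper functions are assumptions for illustration only.

```python
import math
from array import array


def decode_vector(raw):
    """Decode a float32 byte buffer into a list of floats."""
    values = array("f")
    values.frombytes(raw)
    return list(values)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical tag list; this is the name the traceback reports as undefined.
tags = ["cat", "dog"]

# Stand-in for the text features the model would return, one vector per tag.
tag_features = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}

# Stand-in for one database row: the image vector stored as float32 bytes.
image = {"vector": array("f", [0.9, 0.1, 0.0]).tobytes()}

image_features = decode_vector(image["vector"])
scores = {tag: cosine_similarity(image_features, tag_features[tag]) for tag in tags}
best_tag = max(scores, key=scores.get)
```

In the real workaround, tags would be the labels to search for, and current_directory would be the indexed directory whose images should be compared against them.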