enoche / BM3

PyTorch implementation for "Bootstrap Latent Representations for Multi-modal Recommendation"-WWW'23
GNU General Public License v3.0

How to extract visual features from each product image? #9

Closed superwood closed 9 months ago

superwood commented 1 year ago

How do I extract visual features from each product image?
I could not find any relevant code or documentation in the repository.

enoche commented 1 year ago

Hi @superwood, thanks for your question. You can find the code here: https://github.com/enoche/MMRec/tree/master/preprocessing

Actually, the visual features are already included in the raw dataset.

Thanks.

Nipers commented 11 months ago

So if I want to apply BM3 to other datasets, how should I extract features from scratch with a CNN model?

enoche commented 11 months ago

@Nipers Great question! You can use a CNN or a ViT to extract features from an image:

CNN -- Keras

This code assumes that you have already installed the necessary libraries (tensorflow, numpy, PIL) and that you have an image file named 'image.jpg' in your current directory.

import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input

# Load VGG16 model, include_top=False to load model without the fully-connected layers
model = VGG16(weights='imagenet', include_top=False)

# Load the image file and convert it to a numpy array
img_path = 'image.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Use the VGG16 model to extract features
features = model.predict(x)

# Print the extracted features
print(features)
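
With include_top=False and a 224x224 input, VGG16 returns a spatial feature map of shape (1, 7, 7, 512) rather than a single vector. A recommender usually wants one fixed-length vector per item, so one common option is to average over the spatial axes. The sketch below shows that step with plain numpy on a random stand-in array; pool_feature_map is a hypothetical helper name, not something from this repository. Passing pooling='avg' to VGG16 should give an equivalent result directly.

```python
import numpy as np


def pool_feature_map(feature_map: np.ndarray) -> np.ndarray:
    """Average a (batch, height, width, channels) map over height and width."""
    if feature_map.ndim != 4:
        raise ValueError("expected a 4-D array shaped (batch, height, width, channels)")
    return feature_map.mean(axis=(1, 2))


# Stand-in for the output of model.predict(x) on a single 224x224 image.
dummy_features = np.random.rand(1, 7, 7, 512).astype(np.float32)

# One 512-dimensional vector for the image.
item_vector = pool_feature_map(dummy_features)[0]
print(item_vector.shape)  # (512,)
```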

ViT -- PyTorch

For Vision Transformer (ViT), you can use the timm library which has a wide variety of pre-trained models including ViT. Here is a simple example:

import numpy as np
import torch
import timm
from PIL import Image

# Load the ViT model
model = timm.create_model('vit_base_patch16_224', pretrained=True)
model.eval()

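# Load the image and force 3 RGB channels (handles grayscale and RGBA files)
img = Image.open('image.jpg').convert('RGB')
img = img.resize((224, 224))

# Preprocess the image: HWC uint8 -> NCHW float in [0, 1]
# Note: this skips the mean/std normalization the pretrained weights were trained with
img_tensor = torch.tensor(np.array(img)).permute((2, 0, 1)).unsqueeze(0).float() / 255.0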

# Use the ViT model to extract features
with torch.no_grad():
    features = model.forward_features(img_tensor)

# Print the extracted features
print(features)
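
Depending on your timm version, forward_features may return either one pooled vector per image or the full token sequence, which for this model would be shaped (1, 197, 768) with the class token first. The sketch below is one way to reduce either case to a single vector per image. It works on a numpy array (for example features.numpy()), uses a random stand-in for the model output, and to_image_vector is a hypothetical helper name. Taking the class token is an assumption; averaging the patch tokens is another common choice.

```python
import numpy as np


def to_image_vector(features: np.ndarray) -> np.ndarray:
    """Return one vector per image from ViT features.

    (batch, tokens, dim) -> keep the first token (assumed to be the class token).
    (batch, dim)         -> already pooled, returned unchanged.
    """
    if features.ndim == 3:
        return features[:, 0, :]
    if features.ndim == 2:
        return features
    raise ValueError("expected a 2-D or 3-D feature array")


# Stand-in for forward_features output: 1 image, 197 tokens, 768 dimensions.
dummy_tokens = np.random.rand(1, 197, 768).astype(np.float32)
image_vector = to_image_vector(dummy_tokens)[0]
print(image_vector.shape)  # (768,)
```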

Please note that these are just basic examples. In a real-world scenario, you would likely want to add more preprocessing steps and possibly use a different model depending on your specific use case. Also, remember to install any necessary libraries before running the code. You can do this with pip:

pip install tensorflow numpy pillow
pip install torch timm
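
To use the extracted features on your own dataset, you will probably want one matrix with a row per item, saved to disk. The sketch below stacks per-item vectors into a (num_items, dim) array ordered by item index and writes it with np.save. The function name, the toy data, and the file name image_feat.npy are assumptions for illustration; check the dataset config for the file name and item ordering the code actually expects.

```python
import os
import tempfile
from typing import Dict

import numpy as np


def build_feature_matrix(vectors_by_item: Dict[int, np.ndarray], num_items: int) -> np.ndarray:
    """Stack per-item vectors into a (num_items, dim) float32 matrix.

    Row i holds the vector of item index i; items without an image keep a zero row.
    """
    if not vectors_by_item:
        raise ValueError("need at least one feature vector")
    dim = len(next(iter(vectors_by_item.values())))
    matrix = np.zeros((num_items, dim), dtype=np.float32)
    for item_index, vector in vectors_by_item.items():
        matrix[item_index] = np.asarray(vector, dtype=np.float32)
    return matrix


# Hypothetical toy input: 4 items, 3-dimensional features, item 2 has no image.
toy_vectors = {
    0: np.array([0.1, 0.2, 0.3]),
    1: np.array([0.4, 0.5, 0.6]),
    3: np.array([0.7, 0.8, 0.9]),
}
matrix = build_feature_matrix(toy_vectors, num_items=4)

# The file name is an assumption; use whatever name your dataset config points at.
saved_path = os.path.join(tempfile.gettempdir(), "image_feat.npy")
np.save(saved_path, matrix)
print(np.load(saved_path).shape)  # (4, 3)
```

Filling missing items with zeros is only one option; the mean of the available vectors is another.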

Nipers commented 11 months ago

Thank you for your response. I got it working.