Did you figure this out?
Hi @nikky4D and @soumilkanwal80:
We don't support this feature out of the box, but it is possible with a little hack in CocoCaptionsEvalDataset
(https://github.com/kdexd/virtex/blob/master/virtex/data/datasets/downstream.py#L240-L277).
You could modify it to work with an image folder and use image filenames as keys.
On top of that, you can use https://github.com/kdexd/virtex/blob/master/scripts/eval_captioning.py to get results.
You can get started on this. In general it's a nice feature to have, I will add it in.
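A minimal sketch of that hack, assuming the images are read with PIL and the filename is used directly as the key (the class name ImageFolderEvalDataset and the *.jpg glob are illustrative, not part of virtex):

import glob
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class ImageFolderEvalDataset(Dataset):
    r"""Provide only images (for inference) from a directory, keyed by filename."""

    def __init__(self, image_dir: str, image_transform=None):
        self.image_paths = sorted(glob.glob(os.path.join(image_dir, "*.jpg")))
        self.image_transform = image_transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx: int):
        path = self.image_paths[idx]
        image = np.array(Image.open(path).convert("RGB"))
        if self.image_transform is not None:
            # Albumentations-style transform: returns a dict with an "image" key.
            image = self.image_transform(image=image)["image"]
        image = np.transpose(image, (2, 0, 1))
        # Use the filename (not an integer index) as the key, as suggested above.
        return {"image_id": os.path.basename(path), "image": torch.tensor(image)}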
I tried to use a custom dataset class:
from typing import Callable

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

from virtex.data import transforms as T


class FileList(Dataset):
    r"""
    A dataset which provides only images (for inference) from a folder of image files.

    Parameters
    ----------
    files: list, required
        List of paths to the images.
    image_transform: Callable, optional (default = virtex.data.transforms.DEFAULT_IMAGE_TRANSFORM)
        A list of transformations, from either `albumentations
        <https://albumentations.readthedocs.io/en/latest/>`_ or :mod:`virtex.data.transforms`,
        to be applied on the image.
    """

    def __init__(
        self,
        files,
        image_transform: Callable = T.DEFAULT_IMAGE_TRANSFORM,
    ):
        self.files = files
        self.image_transform = image_transform

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx: int):
        # Use the integer index as the image id; load the image with PIL.
        image_id, image = idx, np.array(Image.open(self.files[idx]))
        image = self.image_transform(image=image)["image"]
        image = np.transpose(image, (2, 0, 1))
        return {
            "image_id": torch.tensor(image_id).long(),
            "image": torch.tensor(image),
        }
but then ran into another problem:
OSError: Not found: "datasets/vocab/coco_10k.model": No such file or directory Error #2
@kdexd Could you provide sentencepiece pretrained models?
The SentencePiece vocab and model can be generated in a few seconds with a simple command (requires COCO train2017 captions): https://kdexd.github.io/virtex/virtex/usage/setup_dependencies.html#preprocess-data
Vocab/model generation is deterministic, as long as you use the same annotations.
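If you only need the coco_10k.model/.vocab files, the underlying step is plain SentencePiece training; a rough sketch with the sentencepiece Python package, assuming you have dumped the COCO train2017 captions to a text file with one caption per line (the file path and options here are assumptions, and the repo's preprocessing script may use different settings, e.g. extra special tokens):

import sentencepiece as spm

# Train a 10k-piece BPE model; produces coco_10k.model and coco_10k.vocab.
spm.SentencePieceTrainer.train(
    input="captions_train2017.txt",          # one COCO caption per line (illustrative path)
    model_prefix="datasets/vocab/coco_10k",
    vocab_size=10000,
    model_type="bpe",
)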
It really turned out to be very simple, thank you.
Can you give me a sample of the code you used and your setup?
Setup
Ubuntu 18.04, with all the required dependencies for this repo installed.
Code
New main function in scripts/eval_captioning.py:
import glob

from virtex.data import FileList


def main(_A: argparse.Namespace):

    if _A.num_gpus_per_machine == 0:
        # Set device as CPU if num_gpus_per_machine = 0.
        device = torch.device("cpu")
    else:
        # Get the current device (this will be zero here by default).
        device = torch.cuda.current_device()

    _C = Config(_A.config, _A.config_override)

    tokenizer = TokenizerFactory.from_config(_C)

    files = glob.glob('path/to/images/*jpg')

    val_dataloader = DataLoader(
        FileList(files),
        batch_size=_C.OPTIM.BATCH_SIZE,
        num_workers=_A.cpu_workers,
        pin_memory=True,
    )

    # Initialize model from a checkpoint.
    model = PretrainingModelFactory.from_config(_C).to(device)
    ITERATION = CheckpointManager(model=model).load(_A.checkpoint_path)
    model.eval()

    for val_iteration, val_batch in enumerate(val_dataloader, start=1):
        for key in val_batch:
            val_batch[key] = val_batch[key].to(device)

        # Make a dictionary of predictions in COCO format.
        with torch.no_grad():
            output_dict = model(val_batch)

        for image_id, caption in zip(
            val_batch["image_id"], output_dict["predictions"]
        ):
            print(files[image_id], tokenizer.decode(caption.tolist()))
Also change virtex/data/__init__.py to import the FileList class:
from .datasets.captioning import CaptioningDataset
from .datasets.multilabel import MultiLabelClassificationDataset
from .datasets.downstream import (
    ImageNetDataset,
    INaturalist2018Dataset,
    VOC07ClassificationDataset,
    CocoCaptionsEvalDataset,
    FileList,
)

__all__ = [
    "CaptioningDataset",
    "MultiLabelClassificationDataset",
    "CocoCaptionsEvalDataset",
    "ImageNetDataset",
    "INaturalist2018Dataset",
    "VOC07ClassificationDataset",
    "FileList",
]
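If you want the generated captions saved rather than just printed, here is a hedged variation on the loop in main() above, wrapped as a self-contained helper (the function name and the output path are illustrative, not part of virtex):

import json

import torch


def caption_directory(model, tokenizer, val_dataloader, files, device,
                      out_path="predicted_captions.json"):
    """Run the captioning model over the dataloader and dump {image path: caption} to JSON."""
    model.eval()
    predictions = {}
    for val_batch in val_dataloader:
        val_batch = {key: value.to(device) for key, value in val_batch.items()}
        with torch.no_grad():
            output_dict = model(val_batch)
        for image_id, caption in zip(val_batch["image_id"], output_dict["predictions"]):
            # image_id is the integer index from FileList, so it can index the files list.
            predictions[files[image_id]] = tokenizer.decode(caption.tolist())
    with open(out_path, "w") as f:
        json.dump(predictions, f, indent=2)
    return predictions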
Looks very neat, glad you got this working! I will add this feature by end of week.
I added this feature in master! Main additions are ImageDirectoryDataset and its usage in scripts/eval_captioning.py.
Refer to the updated instructions here:
Closing this issue for now. Feel free to re-open for any questions or issues!
Thanks so much everyone.
Hi there, I am also trying to run captioning on a folder of sample images on my machine. After generating the coco_10k.vocab file and correctly setting the paths for the model & config file in the example command line, I ran the command at the bottom of this documentation page: https://kdexd.github.io/virtex/virtex/usage/downstream.html, but I got the following error:
File "scripts/eval_captioning.py", line 113, in <module>
main(_A)
File "scripts/eval_captioning.py", line 86, in main
"image_id": image_id.item(),
AttributeError: 'str' object has no attribute 'item'
Can you please help me figure out what is wrong with my process? Thank you!
@freeIsa: Oops, I think you encountered an edge case: your image file names may contain alphabetic characters (not just numbers). I have handled this edge case, please pull from master! Let me know if you face any issues :-)
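For reference, a tiny sketch of the kind of guard that lets a prediction loop accept both tensor ids (numeric filenames) and plain string filenames; this only illustrates the edge case and is not necessarily the exact fix in master:

import torch

def to_key(image_id):
    """Return a JSON-friendly id whether image_id is a 0-dim tensor or a filename string."""
    return image_id.item() if isinstance(image_id, torch.Tensor) else image_id

# Both kinds of ids work:
print(to_key(torch.tensor(123456)))   # -> 123456
print(to_key("dog_on_beach.jpg"))     # -> dog_on_beach.jpg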
Thank you @kdexd, now it's working! 🎉
Hi @kdexd, I followed the newest instructions for running image captioning inference on my own images, but I also hit this error when running eval_captioning.py. Did I miss something before running the script?
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
OSError: Not found: "datasets/vocab/coco_10k.model": No such file or directory Error #2
Hi @08tjlys, please follow Step 1 here: http://kdexd.xyz/virtex/virtex/usage/setup_dependencies.html#preprocess-data
Hi,
I would like to evaluate your work on a single image for image captioning. Can you tell me the steps I should follow for a single input? For instance, given a folder of images, how would I use your model for inference on just that folder of images?
Looking at the captioning task in your description, I am not sure how to go about using my own dataset to evaluate the model.
Thanks