allenai / savn

Learning to Learn how to Learn: Self-Adaptive Visual Navigation using Meta-Learning (https://arxiv.org/abs/1812.00971)
Apache License 2.0
185 stars 56 forks

some details on the image-frame ordering #3

Closed AruniRC closed 5 years ago

AruniRC commented 5 years ago

Hi,

could you please share the file format and structure used for storing the image features?

It seems your code reads pre-computed features for each frame the agent can see from an HDF5 dump, at this line in your codebase: https://github.com/allenai/savn/blob/1cda8aff1722543450e72bc74374da80dc84c771/datasets/environment.py#L18

If we want to use other types of features (e.g. a different feature extractor than what you use to represent the frame image), it would be super helpful to have more details on how the feature dump is constructed, the ordering of the frame images, etc.

thank you!

mitchellnw commented 5 years ago

No problem.

What you could do:

https://github.com/allenai/savn/blob/1cda8aff1722543450e72bc74374da80dc84c771/datasets/offline_controller_with_small_rotation.py#L520

is in charge of reading the HDF5 file.

When it is initialized, it is given the path to this HDF5 file. It is initialized here:

https://github.com/allenai/savn/blob/1cda8aff1722543450e72bc74374da80dc84c771/datasets/environment.py#L30

where images_file_name initially comes from args.

https://github.com/allenai/savn/blob/1cda8aff1722543450e72bc74374da80dc84c771/episodes/basic_episode.py#L126

So if you want to use different features you should add

...
--images_file_name <new-features>
...

to your run.
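For example (hypothetical invocation; the entry point and the remaining flags depend on how you launch training):

python main.py --images_file_name <new-features> ...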

You should place these features in thor_offline_data/FloorPlan<n>/<new-features>.
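For reference, a sketch of the expected layout (the file names besides <new-features> are illustrative; they mirror what the scraping script below writes):

thor_offline_data/
├── FloorPlan1/
│   ├── grid.json
│   ├── graph.json
│   ├── metadata.json
│   └── <new-features>
├── FloorPlan2/
│   └── ...
└── ...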

The images are read here

https://github.com/allenai/savn/blob/1cda8aff1722543450e72bc74374da80dc84c771/datasets/offline_controller_with_small_rotation.py#L825

so there should be a feature associated with each str(self.state).

The __str__ method is defined here:

https://github.com/allenai/savn/blob/1cda8aff1722543450e72bc74374da80dc84c771/datasets/offline_controller_with_small_rotation.py#L59
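As a concrete sketch of the lookup (hypothetical key and path; verify the exact format against that method), assuming keys of the form "x|z|rotation|horizon":

import h5py

# One dataset per reachable agent state; the key is str(self.state).
features = h5py.File('thor_offline_data/FloorPlan1/<new-features>', 'r')
key = '1.25|-2.00|90|0'  # hypothetical x|z|rotation|horizon pose
v = features[key][:]     # the feature the agent sees in this state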

What we do:

We run the following script:


import os
from multiprocessing import Process, Queue
from queue import Empty

from datasets.offline_controller_with_small_rotation import ExhaustiveBFSController

def search_and_save(in_queue):
    while not in_queue.empty():
        try:
            scene_name = in_queue.get(timeout=3)
        except Empty:
            return
        c = None
        try:
            out_dir = os.path.join(<path-to-where-you-want-data-to-go>, scene_name)  # fill in your output root
            if not os.path.exists(out_dir):
                os.mkdir(out_dir)

            print('starting:', scene_name)
            # Exhaustively BFS the scene, dumping the reachability grid, the
            # scene graph, per-state metadata, and RGB/depth frames for every
            # reachable (position, rotation, horizon) state.
            c = ExhaustiveBFSController(
                grid_size=0.25,
                fov=90.0,
                grid_file=os.path.join(out_dir, 'grid.json'),
                graph_file=os.path.join(out_dir, 'graph.json'),
                metadata_file=os.path.join(out_dir, 'metadata.json'),
                images_file=os.path.join(out_dir, 'images.hdf5'),
                depth_file=os.path.join(out_dir, 'depth.hdf5'),
                grid_assumption=False)
            c.start()
            c.search_all_closed(scene_name)
            c.stop()
        except AssertionError as e:
            print('Error is', e)
            print('Error in scene {}'.format(scene_name))
            if c is not None:
                c.stop()
            continue

def main():

    num_processes = 30

    queue = Queue()
    scene_names = []
    # Kitchens are FloorPlan1-30; living rooms are FloorPlan201-230.
    for i in range(2):
        for j in range(30):
            if i == 0:
                scene_names.append("FloorPlan" + str(j + 1))
            else:
                scene_names.append("FloorPlan" + str(i + 1) + '%02d' % (j + 1))
    for x in scene_names:
        queue.put(x)

    processes = []
    for i in range(num_processes):
        p = Process(target=search_and_save, args=(queue,))
        p.start()
        processes.append(p)

    for p in processes:
        p.join()

if __name__ == '__main__':
    main()

Note that AI2-THOR (https://github.com/allenai/ai2thor) has changed since we began this project so this script will not work out of the box -- it may require some changes. I am sorry about this and am happy to help if there are issues. The scenes in Thor themselves have also changed and are much better now.

This script does BFS in a scene and saves an HDF5 file in the desired format with all of the RGB images.

e.g. you will now have FloorPlan<n>/images.hdf5.
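For a quick sanity check of the dump (hypothetical path), something like this lists a few state keys and frame shapes:

import h5py

with h5py.File('thor_offline_data/FloorPlan1/images.hdf5', 'r') as f:
    for k in list(f.keys())[:5]:
        print(k, f[k].shape)  # expect pose-string keys and HxWx3 uint8 frames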

You can now simply iterate over these files to get the features that you want. We run a variant of the following script to get the ResNet features:

    import h5py
    import torch

    # Assumes `scenes` is a list like ["FloorPlan1", ...], `data_dir` is where the
    # scraped data lives, `method` names the output feature file, and `model` is a
    # CNN trunk that returns a 512x7x7 feature map (e.g. ResNet-18 up to its last
    # conv block), already on the GPU if one is available.
    for scene in scenes:
        images = h5py.File('{}/{}/images.hdf5'.format(data_dir, scene), 'r')
        features = h5py.File('{}/{}/{}.hdf5'.format(data_dir, scene, method), 'w')

        for k in images:
            # The feature file reuses the raw-image keys, so each str(state)
            # maps to both a frame and its feature.
            frame = resnet_input_transform(images[k][:], 224)
            frame = torch.Tensor(frame)
            if torch.cuda.is_available():
                frame = frame.cuda()
            frame = frame.unsqueeze(0)

            v = model(frame)
            v = v.view(512, 7, 7)

            v = v.cpu().numpy()
            features.create_dataset(k, data=v)

        images.close()
        features.close()

where resnet_input_transform is

from torchvision import transforms

# ScaleBothSides is a helper from this codebase that resizes both sides of the
# image to im_size (unlike transforms.Resize, which preserves the aspect ratio).
def resnet_input_transform(input_image, im_size):
    normalize = transforms.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    all_transforms = transforms.Compose([
        transforms.ToPILImage(),
        ScaleBothSides(im_size),
        transforms.ToTensor(),
        normalize,
    ])
    transformed_image = all_transforms(input_image)
    return transformed_image

mitchellnw commented 5 years ago

The script will also store the depth information as depth.hdf5, in addition to the RGB images as images.hdf5, which will be helpful if you want to generate features from RGBD data.
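A minimal sketch of pairing the two (assuming the depth frames are stored HxW under the same state keys as the RGB frames):

import h5py
import numpy as np

rgb = h5py.File('FloorPlan1/images.hdf5', 'r')
depth = h5py.File('FloorPlan1/depth.hdf5', 'r')

k = list(rgb.keys())[0]
# Stack depth as a fourth channel to get an HxWx4 RGBD frame for featurization.
rgbd = np.concatenate([rgb[k][:], depth[k][:][..., None]], axis=-1)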

We apologize that the process described above is somewhat tedious; we should have anticipated that people would want to do what you are doing and made this process easier.

However, re-scraping the data is not a terrible idea, as we recommend using the newest AI2-THOR in future projects. Almost a year of engineering has gone into improving Thor since we finished this project, and there are a lot of cool new things you can do now (e.g. https://ai2thor.allenai.org/demo/).

AruniRC commented 5 years ago

Hi @mitchellnw, thanks for the detailed response; I will try to replicate as much as possible. How about an easier alternative (if feasible)? If you have the images.hdf5 files corresponding to your published experiments cached somewhere, then I can simply read the RGB images from there and extract features using a CNN model of my choice. The image <--> feature correspondence would then be maintained. Does this sound reasonable?

mitchellnw commented 5 years ago

Yes, great idea. I will hopefully be able to locate these today. Would you want the depth information as well?

By the way, I checked out your work on unsupervised domain adaptation and it's really cool!

mitchellnw commented 5 years ago

this is hopefully what you need: floorplans_with_images

AruniRC commented 5 years ago

Thanks a lot for the quick response @mitchellnw! I will do some sanity checks, and let's hope it works :)

mitchellnw commented 5 years ago

Closing this issue; please reopen if this is still a problem!

AruniRC commented 5 years ago

@mitchellnw -- sure. I had not worked on this much after downloading the imagery. I am trying to come up with a minimal working scenario to quickly verify this works (sorry to bug you again with this...):

My plan is to just train and test on Kitchen scenes, using features extracted from your shared imagery (instead of the pre-extracted features your codebase provides). If this gets similar performance, it means there is a straightforward way to use any custom feature representation of the scene images. This would mean:

  1. Extracting ResNet features on the FloorPlan<n> images of Kitchen scenes (from the link you shared: floorplans_with_images)

  2. Placing these features under thor_offline_data/FloorPlan<n>/<new-features>

  3. Calling savn/episodes/basic_episode.py with --images_file_name <new-features>

Does this sound OK? Also, is there a quick way to map from FloorPlan<n> to scene type (i.e., which FloorPlans correspond to Kitchens, if I just want to train and test on a split of kitchen scenes)?

thank you!

mitchellnw commented 5 years ago

Yes, sounds good, good luck! And yes, the kitchens are FloorPlans 1-30.

AruniRC commented 5 years ago

Another question -- is there any distinction between FloorPlan<n> and FloorPlan<n>_physics in terms of just the images.hdf5? Some of the FloorPlans have the suffix "_physics" (e.g. FloorPlan1_physics), but others do not.

mitchellnw commented 5 years ago

You shouldn't have to worry about that; the "_physics" scenes come from a slightly newer (better) version of AI2-THOR. When we were doing the project, only Kitchens and Living Rooms were completed.

By the way -- I would not recommend scraping a feature for each image. In hindsight, this may have been detrimental to performance, since you then can't really do data augmentation. I would recommend running your featurizer on the fly.
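A minimal sketch of that idea (hypothetical names; assumes the raw frames from images.hdf5 are available at episode time):

import h5py
import torch

def featurize(images_h5, state_key, model, transform):
    # Featurize the raw RGB frame for the current state at step time, so the
    # transform can apply a different augmentation on every call.
    frame = images_h5[state_key][:]
    x = transform(frame).unsqueeze(0)
    with torch.no_grad():
        return model(x)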

AruniRC commented 5 years ago

Thanks for the info.

The features I am planning to use are pretty heavy (basically, running a detector on each frame, which has a fairly large overhead to keep around in memory at runtime). If data augmentation is not done at any phase, then the relative performance comparison should still be valid, I think... but it's a good point; I'll try to work around the memory footprint eventually.

xuai05 commented 3 years ago

I would like to ask whether you have encountered this situation: when running the program, it gets stuck at the following output:

Unable to preload the following plugins: ScreenSelector.so
Unable to preload the following plugins: ScreenSelector.so
Loading player data from /home/ubuntu/.ai2thor/releases/thor-201903131714-Linux64/thor-201903131714-Linux64_Data/data.unity3d
Loading player data from /home/ubuntu/.ai2thor/releases/thor-201903131714-Linux64/thor-201903131714-Linux64_Data/data.unity3d