google-research / kubric

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Apache License 2.0

How to generate an object detection dataset using kubric? #235

Closed Bin-ze closed 1 year ago

Bin-ze commented 2 years ago

Can kubric be used to generate object detection data in virtual scenes? If so, how can it be implemented? Thanks for your replies!!!

MrXandbadas commented 2 years ago

Hello!

Let's take a look at the examples folder; specifically, we will arbitrarily choose helloworld.py (seen here):

kb.write_png(frame["rgba"], "output/helloworld.png")
kb.write_palette_png(frame["segmentation"], "output/helloworld_segmentation.png")
scale = kb.write_scaled_png(frame["depth"], "output/helloworld_depth.png")
logging.info("Depth scale: %s", scale)

Let's dive into the file that defines these, kubric/file_io.py. kb.write_png is defined on line 93, seen here:

def write_png(data: np.array, filename: PathLike) -> None:
  """Writes data as a png file (and convert datatypes if necessary)."""

Line 134 (seen here) defines the write_palette_png function:

def write_palette_png(data: np.array, filename: PathLike, palette: np.ndarray = None):
  """Writes grayscale data as pngs to path using a fixed palette (e.g. for segmentations)."""

Upon further investigation, it seems as though there is a dictionary of default data handlers that has been built: DEFAULT_WRITERS (seen here).

Let's take a quick look at it!

DEFAULT_WRITERS = {
    "rgb": write_rgb_batch,
    "rgba": write_rgba_batch,
    "depth": write_depth_batch,
    "uv": write_uv_batch,
    "normal": write_normal_batch,
    "flow": write_flow_batch,
    "forward_flow": write_forward_flow_batch,
    "backward_flow": write_backward_flow_batch,
    "segmentation": write_segmentation_batch,
    "object_coordinates": write_coordinates_batch,
}

I'm guessing something will need to be done with one of these to actually build the boxes from the camera's angle?
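For what it's worth, kb.write_image_dict (which the worker files call) appears to be the consumer of this dictionary. A minimal sketch of my understanding, with file names matching what I've seen in the outputs:

# Sketch (assumption): write_image_dict looks up the matching writer in
# DEFAULT_WRITERS for each key of the rendered data and writes one file per
# frame, e.g. output/rgba_00000.png, output/segmentation_00000.png, ...
data_stack = renderer.render()  # dict keyed by "rgba", "depth", "segmentation", ...
kb.write_image_dict(data_stack, "output")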


I kept clicking and found something about computing bboxes in the example bouncing_balls.py here.


Edit 26/05/2022

I've done more looking, and you probably want something like what is seen here in Challenges:

"bboxes_3d": (k, s, 8, 3) [float32] World-space corners of the 3D bounding box around the object. "bboxes": (k, None, 4) [float32] The normalized image-space (2D) coordinates of the bounding box [ymin, xmin, ymax, xmax] for all the frames in which the object is visible (as specified in bbox_frames). "bbox_frames": (k, None) [int] A list of all the frames the object is visible.

Check out the bottom of movi_ab_worker.py,
specifically where they get the segmentation mask and use it to build the bboxes. 3D bboxes are also possible, from what I've been reading. This line is where it really starts.

Bin-ze commented 2 years ago

Thank you very much for your reply!

I checked your reply, but I still have some questions: how can I build a virtual scene and then generate 2D detection boxes for the objects of interest in the scene?

The helloworld.py example shows how to generate a depth map and a segmentation map, so generating boxes should be comparatively easy, but how do I generate specific scenes? For example, I want to generate random people and vehicles in a street scene, label them with bboxes, and finally export the labels. I would appreciate it if you could give me more specific advice!

MrXandbadas commented 2 years ago

You want to be able to export the images with the bboxes and labels drawn on them?

I'll try to whip together an example of something that does that (don't hold me to it. EDIT: Yeah, don't hold me to it; trying to get this done is rough when you barely know what everything does). Although, to my knowledge, the pipeline doesn't currently draw the boxes for you; it just outputs the metadata required to draw said bboxes.

That would be another processing step you would take externally. So you would generate your dataset in Kubric, then run that dataset through some kind of handler that copies the RGB (or RGBA) image X amount of times, each time drawing the next box in sequence according to the metadata file.

In short: generate the bbox metadata, then create a new Python project to take the files created by the pipeline and further process them as you need (using OpenCV or something like that to parse the JSON objects in the metadata file and draw the boxes accordingly).
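Roughly like this, sketched for a single frame (assuming the worker writes an "instances" list to metadata.json like the MOVi workers do; the full script I ended up writing is further down):

# Sketch: draw every box that is visible in frame 0 onto the matching RGBA image.
import json
import cv2 as cv

with open("output/metadata.json") as f:
    meta = json.load(f)

img = cv.imread("output/rgba_00000.png")
height, width = meta["metadata"]["resolution"]
for inst in meta["instances"]:
    if 0 in inst["bbox_frames"]:  # is this object visible in frame 0?
        ymin, xmin, ymax, xmax = inst["bboxes"][inst["bbox_frames"].index(0)]
        cv.rectangle(img, (int(xmin * width), int(ymin * height)),
                     (int(xmax * width), int(ymax * height)), (255, 0, 0))
cv.imwrite("output/frame0_boxes.png", img)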


Edit 31/05/2022

Generate your dataset from the Kubric pipeline. This is what the butt end of a Worker File looks like inside the Kubric pipeline:

logging.info("Rendering the scene ...")
data_stack = renderer.render()

# --- Postprocessing
kb.compute_visibility(data_stack["segmentation"], scene.assets)
visible_foreground_assets = [asset for asset in scene.foreground_assets
                             if np.max(asset.metadata["visibility"]) > 0]
visible_foreground_assets = sorted(  # sort assets by their visibility
    visible_foreground_assets,
    key=lambda asset: np.sum(asset.metadata["visibility"]),
    reverse=True)

data_stack["segmentation"] = kb.adjust_segmentation_idxs(
    data_stack["segmentation"],
    scene.assets,
    visible_foreground_assets)
scene.metadata["num_instances"] = len(visible_foreground_assets)

# Save to image files
kb.write_image_dict(data_stack, output_dir)
kb.post_processing.compute_bboxes(data_stack["segmentation"],
                                  visible_foreground_assets)

# --- Metadata
logging.info("Collecting and storing metadata for each object.")
kb.write_json(filename=output_dir / "metadata.json", data={
    "flags": vars(FLAGS),
    "metadata": kb.get_scene_metadata(scene),
    "camera": kb.get_camera_info(scene.camera),
    "instances": kb.get_instance_info(scene, visible_foreground_assets),
})
kb.write_json(filename=output_dir / "events.json", data={
    "collisions":  kb.process_collisions(
        collisions, scene, assets_subset=visible_foreground_assets),
})

kb.done()

You could also just use the MOVi-F example to generate data, or as a full example of the Kubric pipeline.

Disclaimer

This is not inside of Kubric; this is something I wrote that USES the data made above by Kubric!

Let's walk through how to use it, for anyone else looking (and for future me xD).

Get a fresh dev environment and install OpenCV. Get all your RGBA files and load them into a folder; for example, all my work was done in a folder called "output". To run the program:

bbox_app.py --label --file --path enter/path/to/piccies&metadata.json

--label: will use the asset_id to label items.
--file: will save the files.
--show: will instead use OpenCV to display each image; press Esc to continue through the images.
--path [path_of_data_to_crunch]: make sure you point to the folder where the rgba images and the metadata.json file get saved.

Look, a load of text... I can't promise it will work for you or won't break something, so please do take care and have a read through to make sure it suits your application.

import json
import os
import cv2 as cv
import argparse

mefile = "metadata.json"
filestarter = "rgba_"
endpng = ".png"
folderspace = "bboxdraw"

def dostuff(**kwargs):
    workspace = None
    mepng = "0000"
    if kwargs["path"]:
        workspace = kwargs["path"]
        mefile = str(workspace) + "/metadata.json"
    else:
        mefile = "metadata.json"  # fall back to a metadata.json in the current folder
    if workspace is None:
        workspace = "."

    f = open(mefile)
    data = json.load(f)
    height, width = data["metadata"]["resolution"]
    num_frames = data["metadata"]["num_frames"]
    frame_count = 0

    while frame_count < num_frames:
        print(frame_count)
        if frame_count >= 100:  # check the larger threshold first so the zero padding stays correct
            mepng = "00"
        elif frame_count >= 10:
            mepng = "000"
        if kwargs["path"]:
            if filestarter:
                name = str(workspace) + "/" + filestarter + mepng + str(frame_count) + endpng
                name_no_workspace = filestarter + mepng + str(frame_count) + endpng
            else:
                name = str(workspace) + "/" + mepng + str(frame_count) + endpng
                name_no_workspace = mepng + str(frame_count) + endpng
        else:
            if filestarter:
                name = filestarter + mepng + str(frame_count) + endpng
                print(name)
            else:
                name = mepng + str(frame_count) + endpng
                print(name)

        # Open the image
        image = cv.imread(name)
        # Make a Copy
        alltogether = image.copy()

        for i in data["instances"]:
            imageblank = image.copy()
            print(f"Hey " + str(len(i["bbox_frames"])))
            try:
                ouritem = i["bbox_frames"]
                if ouritem[frame_count]:
                    print(f"YES for {frame_count}")
                if i["bbox_frames"][frame_count] is frame_count:
                    item = i["bboxes"][frame_count]
                y_min = int(item[0] * height)
                x_min = int(item[1] * width)
                y_max = int(item[2] * height)
                x_max = int(item[3] * width)

            except Exception as e:
                print(f"[EXCEPTION] {e}")
            finally:

                # you need top-left corner and bottom-right corner of rectangle
                cv.rectangle(alltogether, (x_min, y_min), (x_max, y_max), (255, 0, 0))
                cv.rectangle(imageblank, (x_min, y_min), (x_max, y_max), (255, 0, 0))
            #Label image
            if kwargs["label"]:
                cv.putText(alltogether, i["asset_id"], (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))
                cv.putText(imageblank, i["asset_id"], (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))

            #Save image
            if kwargs["file"]:
                if not os.path.exists(f"{workspace}\\{folderspace}"):
                    os.makedirs(f"{workspace}\\{folderspace}")
                if kwargs["path"]:
                    name = name_no_workspace
                if not os.path.exists(f"{workspace}\\{folderspace}\\{i['asset_id']}"):
                    os.makedirs(f"{workspace}\\{folderspace}\\{i['asset_id']}")
                path = f"{workspace}\\{folderspace}\\{i['asset_id']}\\"
                #print(f"{workspace}/{folderspace}/{i['asset_id']}"+name)
                cv.imwrite(path + i['asset_id'] +name, imageblank)

        if kwargs["file"]:
            if kwargs["path"]:
                name = name_no_workspace
            if not os.path.exists(f"{workspace}/{folderspace}"):
                os.makedirs(f"{workspace}/{folderspace}")
            path = f"{workspace}/{folderspace}/"
            cv.imwrite(os.path.join(path, f'{name}.jpg'), alltogether)
        frame_count += 1

        #Show image
        if kwargs["show"]:
            cv.imshow(name, alltogether)
            cv.waitKey(0)
            cv.destroyAllWindows()

    f.close()
    if frame_count == num_frames:
        return

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--label", help="Label the image", action="store_true")
    parser.add_argument("--show", help="Show the image", action="store_true")
    parser.add_argument("--file", help="Save the image", action="store_true")
    parser.add_argument("--path", help="Path to the folder")
    args = parser.parse_args()
    dostuff(**vars(args))

This is what I got in the end, tucked away in a folder named "bboxdraw" inside whatever folder you point the program at. [result images]

I hope this helps xD I won't lie, it was a little painful, but look at it! Results!

Bin-ze commented 2 years ago

First: "You could also just use the MOVi-F example to generate data, or as a full example of the Kubric pipeline." So I need to run this script to generate the dataset and metadata.json first, and then use the processing script you wrote to complete the bbox drawing? Forgive me for asking such a question; I'm a beginner, so maybe the question is stupid. From the results you gave, this is indeed what I want. Such a synthetic object detection dataset can indeed save a lot of money, which is great work! I still have some questions:

  1. How is the scene generated? The scene here seems to be provided by the repo.
  2. I followed the official example to run the code, but an error was reported: [error screenshot]
  3. I see mefile = "metadata.json". Is it possible to complete the data generation with the script you provided?

Because the environment is difficult to configure, it is not easy to run the code inside the container image to debug, which is very inconvenient, but thank you very much for what you have done. I believe this can help more people.

MrXandbadas commented 2 years ago

No need to request forgiveness! Just make sure you read and pose questions to better help yourself understand. Never be afraid to ask for clarification! The internet can be a great place to learn :D

I plan to answer all your questions, but first I want to get through the issue you seem to be having with point 2.

One of the developers mentioned the following; although, at the time of writing my comment here, there have not been any further replies after it:

Hmm weird. That looks like you cannot access gs://kubric-public/assets/KuBasic/KuBasic.json for some reason. Can you verify that you have internet access from your container?

docker run --rm --interactive \
  --user $(id -u):$(id -g)    \
  kubricdockerhub/kubruntu \
  curl https://storage.googleapis.com/kubric-public/assets/KuBasic/KuBasic.json

The above was in response to someone experiencing the same error as in your picture

Now let's jump back to point 1. The scene is generated inside the file referenced in the error (the one hyperlinked in blue by your code editor):

docker run --rm --interactive \
  --user $(id -u):$(id -g)    \
  --volume "$(pwd):/kubric"   \
  kubricdockerhub/kubruntu    \
  /usr/bin/python3 challenges/movi/movi_def_worker.py \
  --camera=fixed_random

The last two lines of the command you copy and paste give us further insight as well. Let's explore xD

/usr/bin/python3 challenges/movi/movi_def_worker.py \

The path to the script is the first part, and the backslash ("\") is so we can spread our command over multiple lines. The next line is an argument passed through to the script; if we look at the file mentioned, we can find the line that defines the argument we pass through: https://github.com/google-research/kubric/blob/ea9556f6f6228aa35298ca35cd6e5db38725e85f/challenges/movi/movi_def_worker.py#L52

Parkour! Point 3: in my earlier comment, which I edited on the 31st, I mention:

This is what the butt end of a Worker File looks like inside the Kubric pipeline:

logging.info("Rendering the scene ...")
.......

If we go back to the file we run, we will find the aforementioned "end of what a Worker File looks like": https://github.com/google-research/kubric/blob/ea9556f6f6228aa35298ca35cd6e5db38725e85f/challenges/movi/movi_def_worker.py#L241 I was generalizing, and I failed to mention that other examples do NOT look like that consistently. There are just a few different "boilerplates" for specific needs, and yours is a need for the segmentation data and the metadata for bboxes.

I want to make sure I include everything you need to get the desired result. It's a bit of a walk in the park, and there might be a few more questions you have. Explore the examples folder of this repo, read through how the examples are structured, and try to find differences. Relate them to the MOVi-F worker file we looked at earlier to get a broader understanding of the Kubric pipeline. There is quite an easy way to run these examples: https://github.com/google-research/kubric/blob/ea9556f6f6228aa35298ca35cd6e5db38725e85f/Makefile#L49

make examples/helloworld

for example, will start this example file of the repo.

Let's now answer, in plain English: how do we create a dataset from the beginning?

Create a custom Kubric worker file that creates a scene using certain objects. Output the appropriate data.
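As a bare skeleton, that worker file boils down to something like this (a sketch only; the full runnable example later in this thread fleshes it out):

# Sketch of a minimal custom worker: scene -> objects -> render -> write.
import kubric as kb
from kubric.renderer.blender import Blender as KubricBlender

scene = kb.Scene(resolution=(256, 256))
scene.frame_end = 1
renderer = KubricBlender(scene)

scene += kb.Cube(name="box", scale=0.5, position=(0, 0, 0.5))
scene += kb.DirectionalLight(name="sun", position=(1, 1, 3), look_at=(0, 0, 0))
scene.camera = kb.PerspectiveCamera(name="camera", position=(3, -3, 3),
                                    look_at=(0, 0, 0))

data_stack = renderer.render()
kb.write_image_dict(data_stack, "output")
kb.done()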


I hope I was able to answer a few of your questions. If there are any more, let me know. If you don't have an internet connection from inside your container, you will struggle to get the pipeline operational. Once it works once, there isn't much trouble in getting it to function time and time again.

Bin-ze commented 2 years ago

I used the following test command as you mentioned:

docker run --rm --interactive \
              --user $(id -u):$(id -g)    \
              kubricdockerhub/kubruntu \
              curl https://storage.googleapis.com/kubric-public/assets/KuBasic/KuBasic.json

The content of the JSON file loaded successfully, so my container can access the internet. But visiting foreign websites in my country is not allowed, so it may be because of this that the code runs abnormally.

I would like to know if there is a way to load the JSON locally: I would download the required files and put them in a folder on the local server, instead of accessing the URL and downloading through the Python code. After I solve the problem of generating the dataset, I can proceed to try building the object detection dataset with the code you provided.

MrXandbadas commented 2 years ago

Create a new folder somewhere and make your way in a browser to the following link

https://console.cloud.google.com/storage/browser/_details/kubric-public/assets/KuBasic.json;tab=live_object

I'm not sure how you will go about loading it inside the file. I tried quickly for demonstrative purposes, and from what I could tell I wasn't successful in getting it to load locally, although you may have better luck, as it's your setup that has the issues.
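One thing that might be worth a try (a sketch only, not something I've verified): the MOVi workers read the manifest path from a flag (e.g. --kubasic_assets), so if kb.AssetSource.from_manifest accepts a local file path, you could point it at your downloaded copy:

# Hedged sketch: load a locally downloaded manifest instead of the gs:// URL.
# Assumes kb.AssetSource.from_manifest accepts a local path, and that the asset
# archives referenced inside the manifest are reachable from your machine too.
import kubric as kb

kubasic = kb.AssetSource.from_manifest("/path/to/local/KuBasic.json")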

I cobbled together a little example for you to use that should work in your region. It's really simple, but it should provide a solid learning base for getting a response out of the Kubric pipeline without downloading extra assets from inside the repo. I think I got this from the bouncing balls example. I included the function to compute bboxes and also made sure the metadata file was made up of things we want/need. Read through it and try to get it working. I wish you luck, and I hope you're a step closer to understanding the pipeline.

import logging
import kubric as kb
from kubric.renderer.blender import Blender as KubricBlender
from kubric.simulator.pybullet import PyBullet as KubricSimulator
import numpy as np

logging.basicConfig(level="INFO")
print(f"executing '{__file__}' with kubric=={kb.__version__}")

# --- create scene and attach a renderer and simulator
scene = kb.Scene(resolution=(256, 256))
scene.frame_end = 4   # < numbers of frames to render
scene.frame_rate = 24  # < rendering framerate
scene.step_rate = 240  # < simulation framerate
renderer = KubricBlender(scene)
simulator = KubricSimulator(scene)

# --- populate the scene with objects, lights, cameras
scene += kb.Cube(name="floor", scale=(3, 3, 0.1), position=(0, 0, -0.1),
                 static=True)
scene += kb.DirectionalLight(name="sun", position=(-1, -0.5, 3),
                             look_at=(0, 0, 0), intensity=1.5)
scene.camera = kb.PerspectiveCamera(name="camera", position=(2, -0.5, 4),
                                    look_at=(0, 0, 0))
scene.camera.position = (6, 0, 3.0)
scene.camera.set_position((6, 0, 3.0))

# --- generate spheres with random colors and velocities inside a fixed spawn region
spawn_region = [[-1, -1, 0], [1, 1, 1]]
rng = np.random.default_rng()
for i in range(8):
    velocity = rng.uniform([-1, -1, 0], [1, 1, 0])
    material = kb.PrincipledBSDFMaterial(color=kb.random_hue_color(rng=rng))
    sphere = kb.Sphere(scale=0.1, velocity=velocity, material=material)
    scene += sphere
    kb.move_until_no_overlap(sphere, simulator, spawn_region=spawn_region)

# --- executes the simulation (and store keyframes)
animation, collisions = simulator.run()

# --- renders the output
renderer.save_state("output/simulator.blend")
data_stack = renderer.render()

kb.compute_visibility(data_stack["segmentation"], scene.assets)
visible_foreground_assets = [asset for asset in scene.foreground_assets
                             if np.max(asset.metadata["visibility"]) > 0]
visible_foreground_assets = sorted(  # sort assets by their visibility
    visible_foreground_assets,
    key=lambda asset: np.sum(asset.metadata["visibility"]),
    reverse=True)

data_stack["segmentation"] = kb.adjust_segmentation_idxs(
    data_stack["segmentation"],
    scene.assets,
    visible_foreground_assets)
scene.metadata["num_instances"] = len(visible_foreground_assets)

kb.write_image_dict(data_stack, "output")
kb.post_processing.compute_bboxes(data_stack["segmentation"],
                                  visible_foreground_assets)

data = {
    "metadata": kb.get_scene_metadata(scene),
    "camera": kb.get_camera_info(scene.camera),
    "objects": kb.get_instance_info(scene, visible_foreground_assets)
}
kb.file_io.write_json(filename="output/" + "metadata.json", data=data)

kb.done()

Replace examples/helloworld.py with the above code and run make examples/helloworld.

It should produce 4 frames' worth of images.
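If you want a quick sanity check before wiring up the bbox drawer, you can peek at the metadata it wrote (a small sketch; assumes the default "output" folder from the code above):

# Sketch: inspect the metadata.json produced by the worker above.
import json

with open("output/metadata.json") as f:
    meta = json.load(f)

print(list(meta.keys()))               # expect something like: metadata, camera, objects
print(meta["metadata"]["num_frames"])  # expect 4, matching frame_end above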

Once again, good luck xD

Bin-ze commented 2 years ago

I used the example you provided to generate the data and used the script you wrote to visualize the bboxes, but no valid objects were generated by this helloworld.py. I think the reason is that the valid metadata is not loaded (the JSON file containing object information), so I tried to re-run the example provided by the repo for data generation, but as before, I still can't get the JSON file from the URL. How can I solve this problem?

MrXandbadas commented 2 years ago

Hi there. My silly little copy-and-paste brain found the issue with the above example I provided.

Let's look at why it did not work... The bbox drawing script I gave you looks for "instances" inside the metadata.json file, but the example I gave you stores the instances under "objects" instead. A quick little fix in the definition of the dictionary in our worker file and voilà! We have now solved the issue inside the metadata file that was throwing the first error. Old:

data = {
    "metadata": kb.get_scene_metadata(scene),
    "camera": kb.get_camera_info(scene.camera),
    "objects": kb.get_instance_info(scene, visible_foreground_assets)
}

New:

data = {
    "metadata": kb.get_scene_metadata(scene),
    "camera": kb.get_camera_info(scene.camera),
    "instances": kb.get_instance_info(scene, visible_foreground_assets)
}

Another error, oh my! Turns out the asset does not have an ID. Well, let's implement a fix for that: as a backup we use the segmentation_id (I think it's broken, from what I've read; I'll test some more), BUT let's add another case in case that fails, and we will use a locally created ID, numbering the suckers xD

Because I like copy and paste, here is the source code for the bbox drawer again. Can you find what was changed?

import json
import os
import cv2 as cv
import argparse

mefile = "metadata.json"
filestarter = "rgba_"
endpng = ".png"
folderspace = "bboxdraw"

def dostuff(**kwargs):
    workspace = None
    mepng = "0000"
    if kwargs["path"]:
        workspace = kwargs["path"]
        mefile = str(workspace) + "/metadata.json"
    else:
        mefile = "metadata.json"  # fall back to a metadata.json in the current folder
    if workspace is None:
        workspace = "."

    f = open(mefile)
    data = json.load(f)
    height, width = data["metadata"]["resolution"]
    num_frames = data["metadata"]["num_frames"]
    frame_count = 0

    while frame_count < num_frames:
        print(frame_count)
        if frame_count >= 100:  # check the larger threshold first so the zero padding stays correct
            mepng = "00"
        elif frame_count >= 10:
            mepng = "000"
        if kwargs["path"]:
            if filestarter:
                name = str(workspace) + "/" + filestarter + mepng + str(frame_count) + endpng
                name_no_workspace = filestarter + mepng + str(frame_count) + endpng
            else:
                name = str(workspace) + "/" + mepng + str(frame_count) + endpng
                name_no_workspace = mepng + str(frame_count) + endpng
        else:
            if filestarter:
                name = filestarter + mepng + str(frame_count) + endpng
                print(name)
            else:
                name = mepng + str(frame_count) + endpng
                print(name)

        # Open the image
        image = cv.imread(name)
        # Make a Copy
        alltogether = image.copy()
        number_item = 0

        for i in data["instances"]:
            if(number_item >= len(data["instances"])):
                #reset the number
                number_item = 0

            imageblank = image.copy()
            print(f"Hey " + str(len(i["bbox_frames"])))
            try:
                ouritem = i["bbox_frames"]
                if ouritem[frame_count]:
                    print(f"YES for {frame_count}")
                if i["bbox_frames"][frame_count] is frame_count:
                    item = i["bboxes"][frame_count]
                y_min = int(item[0] * height)
                x_min = int(item[1] * width)
                y_max = int(item[2] * height)
                x_max = int(item[3] * width)

            except Exception as e:
                print(f"[EXCEPTION] {e}")
            finally:

                # you need top-left corner and bottom-right corner of rectangle
                cv.rectangle(alltogether, (x_min, y_min), (x_max, y_max), (255, 0, 0))
                cv.rectangle(imageblank, (x_min, y_min), (x_max, y_max), (255, 0, 0))
            #Label image
            if kwargs["label"]:
                if(hasattr(i,"asset_id")):
                    cv.putText(alltogether, i["asset_id"], (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))
                    cv.putText(imageblank, i["asset_id"], (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))
                elif(hasattr(i, "segmentation_id")):
                    cv.putText(alltogether, i["segmentation_id"], (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))
                    cv.putText(imageblank, i["segmentation_id"], (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))
                else:
                    #use the number
                    cv.putText(alltogether, str(number_item), (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))
                    cv.putText(imageblank, str(number_item), (x_min, y_min), cv.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0))

            #Save image
            if kwargs["file"]:

                if not os.path.exists(f"{workspace}\\{folderspace}"):
                    os.makedirs(f"{workspace}\\{folderspace}")
                if kwargs["path"]:
                    name = name_no_workspace
                if(hasattr(i,"asset_id")):
                    if not os.path.exists(f"{workspace}\\{folderspace}\\{i['asset_id']}"):
                        os.makedirs(f"{workspace}\\{folderspace}\\{i['asset_id']}")
                    path = f"{workspace}\\{folderspace}\\{i['asset_id']}\\"
                    #print(f"{workspace}/{folderspace}/{i['asset_id']}"+name)
                    cv.imwrite(path + i['asset_id'] +name, imageblank)
                elif(hasattr(i, "segmentation_id")):
                    if not os.path.exists(f"{workspace}\\{folderspace}\\{i['segmentation_id']}"):
                        os.makedirs(f"{workspace}\\{folderspace}\\{i['segmentation_id']}")
                    path = f"{workspace}\\{folderspace}\\{i['segmentation_id']}\\"
                    #print(f"{workspace}/{folderspace}/{i['asset_id']}"+name)
                    cv.imwrite(path + i['segmentation_id'] +name, imageblank)
                else:
                    if not os.path.exists(f"{workspace}\\{folderspace}\\{str(number_item)}"):
                        os.makedirs(f"{workspace}\\{folderspace}\\{str(number_item)}")
                    path = f"{workspace}\\{folderspace}\\{str(number_item)}\\"
                    #print(f"{workspace}/{folderspace}/{i['asset_id']}"+name)
                    cv.imwrite(path + str(number_item) + name, imageblank)

            number_item += 1
        if kwargs["file"]:
            if kwargs["path"]:
                name = name_no_workspace
            if not os.path.exists(f"{workspace}/{folderspace}"):
                os.makedirs(f"{workspace}/{folderspace}")
            path = f"{workspace}/{folderspace}/"
            cv.imwrite(os.path.join(path, f'{name}.jpg'), alltogether)
        frame_count += 1

        #Show image
        if kwargs["show"]:
            cv.imshow(name, alltogether)
            cv.waitKey(0)
            cv.destroyAllWindows()

    f.close()
    if frame_count == num_frames:
        return

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--label", help="Label the image", action="store_true")
    parser.add_argument("--show", help="Show the image", action="store_true")
    parser.add_argument("--file", help="Save the image", action="store_true")
    parser.add_argument("--path", help="Path to the folder")
    args = parser.parse_args()
    dostuff(**vars(args))

# This script draws bounding boxes (and optional labels) on the images
# produced by a Kubric worker, using the bboxes/bbox_frames metadata.

I went the extra mile and tested it as well. [result images: rgba_00000.png, rgba_00001.png]

The bbox draw script I wrote won't do everything for you; you're going to need to hit those keys and watch some videos. I've now made the appropriate edits for it to work with all the examples given in this thread 😄 You will need to edit it to your specific needs.


but as before, I still can't get the JSON file from the URL. How can I solve this problem?

That is an excellent question I wish I could help with. I simply don't know.

Bin-ze commented 2 years ago

Actually, I discovered the problem you mentioned early on and fixed it by replacing 'instances' with 'objects', but that's not the point. The real problem is that I don't generate any valid objects at all using helloworld; it's the same for every image. Here's what I generate: [screenshot]

I commented out a line in the code you provided: [screenshot]

There will be an error here because this class instance does not support such a method. I'm not sure how you got it working?

MrXandbadas commented 2 years ago

OH! Crikey. I really do have a copy-paste brain...

Let's find the problems!!!!!

To make debugging easier, you might want to download Blender 2.9 so you can open the .blend file saved by the following line:

# --- renders the output
renderer.save_state("output/simulator.blend")

Opening the Blender file, you will now see the camera is wonky. Keep reading for an explanation, or skip to the bottom of this comment 😄

The class does not support that method; it was from me testing/attempting to fix some things, and I got it working by editing the Kubric core code.

How did I get that working? Good question!

I've created a Pull Request although things take time xD https://github.com/google-research/kubric/pull/238/commits/c735f1aead8ed75d14e8677960d3def83bcd2ac8

Here is my Fork that I do ALL my Kubric stuffs on. https://github.com/MrXandbadas/kubric/tree/KubricCore

Here we are at the funky camera issues listed in the repository's Issues section:
https://github.com/google-research/kubric/issues/197
https://github.com/google-research/kubric/pull/198
All completely viewable from the repo; you just need to do some digging, reading, and a bit o' button mashing 😁 You are now probably going to want to run the code from your local files instead of the Kubric Docker image (if you implemented the fix yourself): https://github.com/google-research/kubric/discussions/193


ORRRRRRRR!!!!!!! Remove the following Lines:

scene.camera.position = (6, 0, 3.0)
scene.camera.set_position((6, 0, 3.0))

Delete both those lines in your worker file. You don't need them; you only need the one above them for this example, as we don't need a moving camera at this point. Keep the following:

scene.camera = kb.PerspectiveCamera(name="camera", position=(2, -0.5, 4),
                                    look_at=(0, 0, 0))

The camera gets moved there by an earlier attempt of mine to replicate the above linked issues. My mistake to leave it in there D:

EDIT:

but that's not the point. The real problem is that I don't generate any valid objects at all using helloworld

Why aren't you getting any object or instance information in the metadata.json file?

kb.compute_visibility(data_stack["segmentation"], scene.assets)
visible_foreground_assets = [asset for asset in scene.foreground_assets
                             if np.max(asset.metadata["visibility"]) > 0]
visible_foreground_assets = sorted(  # sort assets by their visibility
    visible_foreground_assets,
    key=lambda asset: np.sum(asset.metadata["visibility"]),
    reverse=True)

data_stack["segmentation"] = kb.adjust_segmentation_idxs(
    data_stack["segmentation"],
    scene.assets,
    visible_foreground_assets)
scene.metadata["num_instances"] = len(visible_foreground_assets)

kb.write_image_dict(data_stack, "output")
kb.post_processing.compute_bboxes(data_stack["segmentation"],
                                  visible_foreground_assets)

data = {
    "metadata": kb.get_scene_metadata(scene),
    "camera": kb.get_camera_info(scene.camera),
    "objects": kb.get_instance_info(scene, visible_foreground_assets)
}

As we can see from this snippet of the worker file, it only grabs the VISIBLE objects. The camera isn't pointing at anything but the edge of the cube named "floor", which is not an Asset object, so it has no asset_id; it's a Kubric core object, as are the balls.
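If you want to check this yourself, you can print the per-frame visibility that kb.compute_visibility stores on each asset before the filtering happens (a quick sketch):

# Sketch: after kb.compute_visibility(...), each asset carries per-frame
# visible-pixel counts in its metadata; all zeros means it never shows up in
# the segmentation and gets filtered out of visible_foreground_assets.
for asset in scene.foreground_assets:
    print(asset, asset.metadata["visibility"])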

Also, please note that is NOT the only fix required inside the bbox drawer file, as at this point in time segmentation_id isn't working as intended. That is why I updated the script to default to asset_id, then segmentation_id, and finally a numbering system if all else fails.

https://github.com/MrXandbadas/kubric/commit/72415eae2c5e39a9fc6d762741188ee87d1c6301 is the specific commit to my fork of this repository with a potential fix for the segmentation_id issue, allowing segmentation_ids to be passed through to the metadata.json file. I'm going to guess that's how I got it working...

There might be one more thing I'm forgetting to mention... It will come to me xD