taconite / arah-release

[ECCV 2022] ARAH: Animatable Volume Rendering of Articulated Human SDFs
https://neuralbodies.github.io/arah/
MIT License
182 stars 15 forks

Run on custom dataset #28

Open yejr0229 opened 1 year ago

yejr0229 commented 1 year ago

When I train ARAH on my custom dataset, this error occurs:

valid_inds = np.where(mask_at_box[:self.num_fg_samples + 1024] == 1)[0]
fg_inds = np.random.choice(valid_inds.shape[0], size=self.num_fg_samples, replace=False)
ValueError: Cannot take a larger sample than population when 'replace=False'

But when debugging everything looks fine. I guess it's because some intersections between rays and the SMPL bounding box have near bigger than far, so how can I deal with that?

taconite commented 1 year ago

The error should be caused by valid_inds.shape[0] being smaller than self.num_fg_samples. This usually happens when very few rays intersect the SMPL bounding box, or none at all. Usually this is caused by a wrong transformation of either the cameras or the SMPL meshes...
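
Not part of the repo, but a minimal diagnostic sketch along these lines (the mask_at_box / num_fg_samples names follow the snippet above; the replace=True fallback is just one possible stopgap while you fix the transformations):

import numpy as np

def sample_fg_inds(mask_at_box, num_fg_samples):
    # Sketch: sample foreground ray indices, degrading gracefully when too
    # few rays hit the SMPL bounding box (usually a camera/SMPL transform bug).
    valid_inds = np.where(mask_at_box[:num_fg_samples + 1024] == 1)[0]
    if valid_inds.shape[0] == 0:
        raise ValueError('No rays intersect the SMPL bounding box; '
                         'the camera or SMPL transformation is likely wrong.')
    if valid_inds.shape[0] < num_fg_samples:
        print('Only %d valid rays (need %d); check camera extrinsics and '
              'SMPL transforms.' % (valid_inds.shape[0], num_fg_samples))
        # Sample with replacement so the dataloader does not crash while debugging.
        return np.random.choice(valid_inds.shape[0], size=num_fg_samples, replace=True)
    return np.random.choice(valid_inds.shape[0], size=num_fg_samples, replace=False)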

yejr0229 commented 12 months ago

I have solved this problem, thank you so much. But when I validate the model using this script: python validate.py --novel-view --num-workers 4 ${path_to_config}, I encounter CUDA out of memory. I am using a 3090 GPU, and it's all fine during training. How can I solve this?

yejr0229 commented 12 months ago

Is it because the memory was not freed during validation?

taconite commented 12 months ago

You can try reducing point_batch_size to a smaller number here. By default, low_vram is set to False.

Previously there were some OOM issues with large output images, which should be fixed by commit 1b20afe787a10957c122ee0f4779d196612ca906.

Overall, the OOM issues are caused by GPU memory not being released from the previous batch's inference - somehow this happens even with torch.no_grad(). If reducing point_batch_size does not work for you, try changing torch.no_grad() to torch.inference_mode() here, here or here
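
A minimal sketch of the suggested swap (model and points are placeholders; note torch.inference_mode() requires PyTorch >= 1.9, on 1.8.x only torch.no_grad() exists):

import torch

def run_batch(model, points):
    # inference_mode() disables autograd more aggressively than no_grad()
    # and skips bookkeeping that can keep intermediate tensors alive.
    with torch.inference_mode():
        return model(points)

# While debugging OOM, you can also release cached blocks between batches:
# torch.cuda.empty_cache()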

yejr0229 commented 11 months ago

I have set low_vram to True and applied the fix from that commit, but my PyTorch version is 1.8.1 so I don't have torch.inference_mode(). But now another problem comes up:

Traceback (most recent call last):
  File "validate.py", line 107, in <module>
    trainer.validate(model=model, dataloaders=val_loader, ckpt_path=checkpoint_path, verbose=True)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 816, in validate
    return self._call_and_handle_interrupt(self._validate_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 859, in _validate_impl
    results = self._run(model, ckpt_path=self.validated_ckpt_path)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1194, in _run
    self._dispatch()
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1270, in _dispatch
    self.training_type_plugin.start_evaluating(self)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 206, in start_evaluating
    self._results = trainer.run_stage()
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1281, in run_stage
    return self._run_evaluate()
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1329, in _run_evaluate
    eval_loop_results = self._evaluation_loop.run()
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 122, in advance
    output = self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 217, in _evaluation_step
    output = self.trainer.accelerator.validation_step(step_kwargs)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 236, in validation_step
    return self.training_type_plugin.validation_step(*step_kwargs.values())
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 219, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/home/pth-algo/Code/arah-release/im2mesh/metaavatar_render/lightning_model.py", line 180, in validation_step
    model_outputs = self.model(inputs, gen_cano_mesh=True, eval=True)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/pth-algo/Code/arah-release/im2mesh/metaavatar_render/models/__init__.py", line 224, in forward
    return_w=False
  File "/home/pth-algo/Code/arah-release/im2mesh/utils/root_finding_utils.py", line 159, in forward_skinning
    w = query_weights(x_hat, loc, sc_factor, coord_min, coord_max, center, skinning_model, vol_feat, mask=mask, point_batch_size=point_batch_size)
  File "/home/pth-algo/Code/arah-release/im2mesh/utils/root_finding_utils.py", line 94, in query_weights
    wi = skinning_model.decode_w(pi, c=torch.empty(pi.size(0), 0, device=pi.device, dtype=torch.float32), forward=True)
  File "/home/pth-algo/Code/arah-release/im2mesh/metaavatar_render/models/skinning_model.py", line 31, in decode_w
    pts_W = self.skinning_decoder_fwd(p, c=c, **kwargs)
  File "/home/pth-algo/anaconda3/envs/pt181_cu111/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/pth-algo/Code/arah-release/im2mesh/metaavatar/models/decoder.py", line 233, in forward
    return out.reshape(batch_size, n_pts, -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

yejr0229 commented 11 months ago

I set gen_cano_mesh to True because my dataset has scan mesh data; during training it's all fine. I find the mesh vertices and faces are empty during validation at this step: verts, faces = sdf_meshing.create_mesh_vertices_and_faces(sdf_decoder, N=256, max_batch=64 ** 3)
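
A small guard one could add at that step (sketch only; sdf_meshing / sdf_decoder names are taken from the line above) to fail with a clearer message instead of the ambiguous reshape error later:

verts, faces = sdf_meshing.create_mesh_vertices_and_faces(sdf_decoder, N=256, max_batch=64 ** 3)
if len(verts) == 0 or len(faces) == 0:
    # Empty marching-cubes output: the SDF has no zero crossing inside the
    # grid, or the mesh extraction itself failed (e.g. scikit-image version).
    raise RuntimeError('Canonical mesh extraction returned an empty mesh.')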

yejr0229 commented 11 months ago

But even if I set gen_cano_mesh back to False, the OOM problem still happens...

taconite commented 11 months ago

But even if I set gen_cano_mesh back to False, the OOM problem still happens...

At this point, I guess I need some additional information to figure this out... what is the resolution of your custom dataset? Can you post your OOM error here? Also, it would be best if you could provide a checkpoint and example data for reproducing the error...

taconite commented 11 months ago

RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

It seems like a 0-sized tensor is passed into the skinning module, which gives the error. Can you break at "/home/pth-algo/Code/arah-release/im2mesh/metaavatar_render/models/__init__.py:224" and see what is causing the script to pass a 0-sized tensor into the function?
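
One generic way to do that (plain Python debugging, not ARAH-specific): paste a conditional breakpoint just before the failing call, using the x_hat name from the traceback above:

import pdb

if x_hat.numel() == 0:  # the 0-sized tensor reported in the traceback
    pdb.set_trace()     # inspect why no points reached the skinning query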

yejr0229 commented 11 months ago

It seems like a 0-sized tensor is passed into the skinning module, which gives the error. Can you break at "/home/pth-algo/Code/arah-release/im2mesh/metaavatar_render/models/__init__.py:224" and see what is causing the script to pass a 0-sized tensor into the function?

Thanks for replying. This problem was caused by the version of scikit-image; I changed it to 0.19.3 and it works.
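
For anyone hitting the same empty-mesh symptom, pinning the version reported to work here is a one-liner:

pip install scikit-image==0.19.3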

yejr0229 commented 11 months ago

By the way, I wonder how to change the render resolution in ARAH? Is changing the image_size here right? The following code is in /arah-release/im2mesh/metaavatar_render/models/__init__.py:

raster_settings = RasterizationSettings(
    image_size=(512, 512),
)

cam_rot = inputs['cam_rot']
cam_trans = inputs['cam_trans']
K = inputs['intrinsics']

image_size = torch.tensor([[512, 512]], dtype=torch.float32, device=device)
cameras = cameras_from_opencv_projection(cam_rot, cam_trans, K, image_size).to(device)

rasterizer = MeshRasterizer(cameras=cameras, raster_settings=raster_settings)
rendered = rasterizer(mesh_bar)

yejr0229 commented 11 months ago

After validation, I found my output images are randomly named by wandb, so I can't tell which camera and frame a predicted image corresponds to. How can I change this? The randomly named images look like: validation_samples_512_93d1482d54bbbdfb71a6.png, validation_samples_513_2529cec793afc039e3c8.png

taconite commented 11 months ago

By the way, I wonder how to change the render resolution in ARAH? Is changing the image_size here right?

The resolution here is for rendering the extracted mesh only. If you want to change the resolution for the actual volume rendering, you need to modify the dataloader.
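
Not ARAH's actual dataloader code, just the standard recipe (hypothetical helper): if you resize the images the dataloader returns, the intrinsics must be rescaled by the same factors:

import cv2
import numpy as np

def resize_view(img, K, new_h, new_w):
    """Resize an image and rescale its 3x3 intrinsics to match (sketch)."""
    h, w = img.shape[:2]
    img_r = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)
    K_r = K.astype(np.float64)
    K_r[0, :] *= new_w / float(w)  # fx, skew, cx scale with width
    K_r[1, :] *= new_h / float(h)  # fy, cy scale with height
    return img_r, K_r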

taconite commented 11 months ago

After validation,I found my output images are randomly named by wandb,so I can't know the specific camera and frame about the predicted image,how can I change this? the random named image is like: validation_samples_512_93d1482d54bbbdfb71a6.png validation_samples_513_2529cec793afc039e3c8.png

That is the problem with wandb - it does not save images in a human-readable form.

I have a script to extract the images captioned rgb_pred from wandb logs, but I would really recommend you refer to the test code, which saves images to disk without using wandb.

import os
import shutil
import json
import glob
import argparse
import numpy as np

from wandb.proto import wandb_internal_pb2
from wandb.sdk.internal import datastore

# Arguments
parser = argparse.ArgumentParser(
    description='Synchronize offline wandb runs to wandb cloud.'
)
parser.add_argument('--config', type=str, default='',
                    help='Path to config file. In case of logdir is not specified, will read the output dir specified by the config file.')
parser.add_argument('--logdir', type=str, default='',
                    help='If specified, will override the output dir specified in the config file.')
parser.add_argument('--outdir', type=str, default='',
                    help='Output directory to store extracted images.')

# It's also helpful to refer to this issue: https://github.com/wandb/wandb/issues/1768

if __name__ == '__main__':
    args = parser.parse_args()

    if len(args.logdir) <= 0 and len(args.config) <= 0:
        raise ValueError('Either logdir or config needs to be non-empty')

    if len(args.logdir) > 0:
        out_dir = args.logdir
    else:
        from im2mesh import config
        cfg = config.load_config(args.config, 'configs/default.yaml')
        out_dir = cfg['training']['out_dir']

    # online_data_paths = sorted(glob.glob(os.path.join('/home/sfwang/Remote_outputs/arah-custom', log_folder, 'wandb/run-*/run-*.wandb')))
    # offline_data_paths = sorted(glob.glob(os.path.join(out_dir, 'wandb/offline-run-*/run-*.wandb')))
    # offline_data_paths = sorted(glob.glob(os.path.join(out_dir, 'wandb/latest-run/run-*.wandb')))
    all_data_paths = sorted(glob.glob(os.path.join(out_dir, 'wandb/run-*')))
    # Note: this picks the second-to-last run directory; adjust the index
    # if the run you want to extract from is a different one.
    offline_data_paths = sorted(glob.glob(os.path.join(all_data_paths[-2], 'run-*.wandb')))
    # all_data_paths = online_data_paths + offline_data_paths
    all_data_paths = offline_data_paths

    count = 0

    if not os.path.exists(args.outdir):
        os.makedirs(args.outdir)

    for data_path in all_data_paths:
        ds = datastore.DataStore()
        ds.open_for_scan(data_path)

        imgs = []

        while True:
            data = ds.scan_data()
            if data is None:
                break

            pb = wandb_internal_pb2.Record()
            pb.ParseFromString(data)
            record_type = pb.WhichOneof("record_type")
            if record_type == "history":
                for item in pb.history.item:
                    if item.key == 'validation_samples':
                        val_dict = json.loads(item.value_json)
                        # print (val_dict.keys())
                        # wandb.log({
                        #     "validation_samples":
                        #         [wandb.Image(os.path.join(os.path.dirname(data_path),
                        #             'files', img), caption=caption)
                        #             for img, caption in zip(imgs, val_dict['captions'])
                        #         ]
                        #     }
                        # )
                        for img, caption in zip(imgs, val_dict['captions']):
                            if caption == 'rgb_pred':
                                dir_path = os.path.dirname(os.path.realpath(data_path))
                                shutil.copyfile(os.path.join(dir_path, 'files', img), os.path.join(args.outdir, "{:06d}.png".format(count)))
                                count += 1

                        imgs = []

            elif record_type == "files":
                for item in pb.files.files:
                    if 'media' in item.path:
                        imgs.append(item.path)

    # run_id = os.path.basename(offline_data_paths[0]).split('.')[0][4:]
    # new_id = wandb.util.generate_id()
    # wandb_init_kwargs = {'id': run_id, 'resume': must}
    # wandb.init(project="arah", id=run_id, resume="must")
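
A hypothetical invocation (the script filename and paths below are placeholders, not from the repo):

python extract_wandb_images.py --config configs/my_custom.yaml --outdir out/val_images

or, pointing directly at a log directory:

python extract_wandb_images.py --logdir out/my_run --outdir out/val_images
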
jiangguangan commented 1 week ago

When I train ARAH on my custom dataset, this error occurs: ValueError: Cannot take a larger sample than population when 'replace=False' ...

Excuse me, can you tell me the complete process for working with a custom dataset?