google / nerfactor

Neural Factorization of Shape and Reflectance Under an Unknown Illumination
https://xiuming.info/projects/nerfactor/
Apache License 2.0

"ValueError: 'a' cannot be empty unless no samples are taken" in preparation step #5

Closed (cjw531 closed 3 years ago)

cjw531 commented 3 years ago

Hi, I was trying to follow the first preparation step (https://github.com/google/nerfactor/tree/main/nerfactor#preparation) to get the BRDF priors, but I am getting the following error:

$ REPO_DIR="$repo_dir" "$repo_dir/nerfactor/trainvali_run.sh" "$gpus" --config='brdf.ini' --config_override="data_root=$data_root,outroot=$outroot,viewer_prefix=$viewer_prefix"

2021-08-18 12:21:52.126973: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-08-18 12:21:52.165148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-08-18 12:21:52.165428: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-08-18 12:21:52.171059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-08-18 12:21:52.173348: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-08-18 12:21:52.173750: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-08-18 12:21:52.176352: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-08-18 12:21:52.186737: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-08-18 12:21:52.192472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-08-18 12:21:52.194714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-08-18 12:21:52.195193: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-08-18 12:21:52.203031: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3299990000 Hz
2021-08-18 12:21:52.203763: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f68dc000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-18 12:21:52.203782: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-08-18 12:21:52.290736: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x564ac481a330 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-08-18 12:21:52.290795: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2021-08-18 12:21:52.292470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-08-18 12:21:52.292553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-08-18 12:21:52.292584: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-08-18 12:21:52.292611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-08-18 12:21:52.292637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-08-18 12:21:52.292663: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-08-18 12:21:52.292690: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-08-18 12:21:52.292717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-08-18 12:21:52.294454: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-08-18 12:21:52.294508: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-08-18 12:21:52.295132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-18 12:21:52.295142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2021-08-18 12:21:52.295148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2021-08-18 12:21:52.296183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10150 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:68:00.0, compute capability: 7.5)
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0818 12:21:52.299249 140098729092928 mirrored_strategy.py:500] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
[util/io] Output directory already exisits:
    /home/jiwonchoi/nerfactor/output/train/merl/lr1e-2
[util/io] Overwrite is off, so doing nothing
[trainvali] For results, see:
    /home/jiwonchoi/nerfactor/output/train/merl/lr1e-2
Traceback (most recent call last):
  File "/home/jiwonchoi/nerfactor/nerfactor/trainvali.py", line 341, in <module>
    app.run(main)
  File "/home/jiwonchoi/.conda/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/jiwonchoi/.conda/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/jiwonchoi/nerfactor/nerfactor/trainvali.py", line 81, in main
    dataset_train = Dataset(config, 'train', debug=FLAGS.debug)
  File "/home/jiwonchoi/nerfactor/nerfactor/datasets/brdf_merl.py", line 52, in __init__
    mats = np.random.choice(self.brdf_names, n_iden, replace=False)
  File "mtrand.pyx", line 908, in numpy.random.mtrand.RandomState.choice
ValueError: 'a' cannot be empty unless no samples are taken

I double-checked my paths, but I am not sure where this error comes from.

xiumingzhang commented 3 years ago

This error occurs because your self.brdf_names is an empty list. Have you checked whether this glob returns the expected list of files?

https://github.com/google/nerfactor/blob/main/nerfactor/datasets/brdf_merl.py#L34
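
One way to run that check outside the training script is sketched below. It is only an illustration: the helper names `list_matches` and `describe_matches` are hypothetical, and the pattern you pass should be whatever brdf_merl.py actually globs for (the `*.npz` used in the note below is an assumption, not taken from the repository).

```python
from glob import glob
from os.path import join


def list_matches(root, pattern):
    """Return the sorted paths under `root` that match `pattern`."""
    return sorted(glob(join(root, pattern)))


def describe_matches(root, pattern):
    """Summarize the glob result so that an empty match is obvious."""
    paths = list_matches(root, pattern)
    if not paths:
        return "no files match %s" % join(root, pattern)
    return "%d file(s), first: %s" % (len(paths), paths[0])
```

For example, `print(describe_matches(data_root, "*.npz"))` with your own data_root. If it reports no matches, np.random.choice is being handed an empty list, which is exactly the ValueError in the traceback.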

cjw531 commented 3 years ago

I tried to recreate the *.npz dataset with the train/val/test split. However, it only creates test.npz and then fails with the following reshape error:

$ REPO_DIR="$repo_dir" "$repo_dir"/data_gen/merl/make_dataset_run.sh "$indir" "$ims" "$outdir"
Training & Validation:   0%|                                                                       | 0/5 [00:00<?, ?it/s]Loading MERL-BRDF:  /Users/jchoi/workspace/nerfactor/data/brdf_merl/Copyright_Notice.txt
Training & Validation:   0%|                                                                       | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/Users/jchoi/workspace/nerfactor/data_gen/merl/make_dataset.py", line 144, in <module>
    app.run(main)
  File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/Users/jchoi/workspace/nerfactor/data_gen/merl/make_dataset.py", line 75, in main
    brdf = MERL(path=path)
  File "/Users/jchoi/workspace/nerfactor/brdf/merl/merl.py", line 31, in __init__
    cube_rgb = merl.readMERLBRDF(path) # (phi_d, theta_h, theta_d, ch)
  File "/Users/jchoi/workspace/nerfactor/third_party/nielsen2015on/merlFunctions.py", line 19, in readMERLBRDF
    BRDFVals = np.swapaxes(np.reshape(vals,(dims[2], dims[1], dims[0], 3),'F'),1,2)
  File "<__array_function__ internals>", line 6, in reshape
  File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 299, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
    return bound(*args, **kwds)
ValueError: cannot reshape array of size 144 into shape (808591476,1751607666,2037411651,3)

I believe the directory where I saved the downloaded BRDF dataset is not the problem, since the test NumPy array does get created. It seems that someone opened the same issue before and resolved it by deleting the readme inside the downloaded BRDF folder, but I am not sure how that fixed it.

UPDATE: I got this message, but neither train.npz nor validation.npz was created.

Training & Validation: 0it [00:00, ?it/s]

xiumingzhang commented 3 years ago

Hi, I suspect the BRDF data were not even successfully loaded. Could you try inserting a breakpoint right before nerfactor/third_party/nielsen2015on/merlFunctions.py L19? What is vals at that point? If vals is empty there, then that explains the errors you had.
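
The shape in the error message already hints at what was read. The sketch below assumes the reader takes the first 12 bytes of the file as three little-endian int32 dimensions, which is inferred from the traceback rather than confirmed from the reader's source. `values_needed` and `fits` are hypothetical helper names.

```python
import numpy as np

# The failing reshape used (dims[2], dims[1], dims[0], 3), so the header
# that was read is the first three entries of the shape, reversed.
reported_shape = (808591476, 1751607666, 2037411651, 3)
dims = np.array(reported_shape[2::-1], dtype="<i4")

# Reinterpret those three integers as the 12 raw bytes they came from.
header_bytes = dims.tobytes()
print(header_bytes)  # b'Copyright 20'


def values_needed(dims):
    """Number of values a reshape to (dims[2], dims[1], dims[0], 3) needs."""
    total = 3
    for d in dims:
        total *= int(d)  # Python ints, so no int32 overflow
    return total


def fits(dims, n_vals):
    """True if an array of `n_vals` values can take the requested shape."""
    return values_needed(dims) == n_vals


print(fits(dims, 144))  # False
```

The "dimensions" are the ASCII text "Copyright 20", which matches the log line showing Copyright_Notice.txt being loaded as if it were a BRDF file.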

cjw531 commented 3 years ago

I fixed this by re-checking the *.binary files of the MERL dataset. Below is how I solved it and got the train/validation/test sets to load properly.

TL;DR: The path given in the README does not work if you use the downloaded MERL dataset as-is, so you have to modify it. The directory you set in the indir variable should contain only the actual MERL data, with no files other than the *.binary ones.

Download Dataset

When you download the MERL BRDF dataset, the directory structure is as follows:

brdf_merl/
├── Readme.txt
├── Copyright_Notice.txt
├── brdfs
│   ├── Readme.txt
│   └── *.binary (<--100 of them are here, omitted)
└── code
    └── BRDFRead.cpp

Modify the path

In the README, the data path is set as follows:

indir="$proj_root/data/brdf_merl"

However, if you use this brdf_merl/ directory directly, you will get an error like mine, because the data generation code tries to read entries that are not BRDF data, such as Readme.txt, Copyright_Notice.txt, and code/. The directory you pass should contain only the *.binary files. Note that the brdfs/ folder has its own Readme.txt, so remember to remove it before running the code. Changing the indir variable to:

indir="$proj_root/data/brdf_merl/brdfs"

resolves the issue, because the brdfs/ folder then contains only the actual dataset.
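
To confirm that a directory is clean before running the generator, something like the following can help. It is a minimal sketch, not part of the repository: `find_brdf_files` and `find_stray_entries` are hypothetical helpers, and it assumes the generator reads every entry in indir, which is what the errors above suggest.

```python
from pathlib import Path


def find_brdf_files(indir):
    """Return the *.binary files directly inside `indir`, sorted by name."""
    return sorted(p for p in Path(indir).iterdir()
                  if p.is_file() and p.suffix == ".binary")


def find_stray_entries(indir):
    """Return everything in `indir` that is not a *.binary file."""
    keep = set(find_brdf_files(indir))
    return sorted(p for p in Path(indir).iterdir() if p not in keep)
```

With the layout shown above, `find_stray_entries` on brdf_merl/brdfs should list only Readme.txt until you delete it, and `len(find_brdf_files(...))` should then equal the number of materials you downloaded.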

*Side note: I also set ims='512' instead of 256, because the later step that trains the BRDF priors seems to use a data_root pointing to the 512 version.