synthesiaresearch / humanrf

Official code for "HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion"
http://actors-hq.com

Temporal stability vs Rendering quality #7

Closed MHCP001-YUI closed 1 year ago

MHCP001-YUI commented 1 year ago

Hi, thanks for sharing such great work. I tried it with my own custom data (one person moving slowly, captured in 360 degrees by 36 cameras). I found that the rendering quality depends on how the sequence is divided into segments.

The frequency of video flickering seems to be positively correlated with the number of segments. Can this be alleviated through parameter settings? How can I achieve the "temporal stability" shown on the project homepage?

isikmustafa commented 1 year ago

Hi, how many frames do you have in your sequence and how many steps do you train for? The rule of thumb for ActorsHQ is to train for 1000*N iterations (for 4x downscaled data), where N is the number of frames in your video sequence. Although this might differ for your dataset, you can try to follow the same guideline.

Moreover, for your dataset, you may need to adapt the expansion factor threshold. Setting it to 1.25 worked well for all ActorsHQ sequences.
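
As a rough sketch of how that guideline could be expressed (the helper function and config keys below are illustrative, not the actual HumanRF configuration interface):

```python
# Illustrative only: the 1000 * N rule of thumb and the expansion factor
# threshold written as a tiny helper. Key names are hypothetical.

def suggested_training_steps(num_frames: int, steps_per_frame: int = 1000) -> int:
    """Rule of thumb: ~1000 iterations per frame (for 4x downscaled data)."""
    return steps_per_frame * num_frames

config = {
    "n_training_steps": suggested_training_steps(num_frames=100),  # 100 frames -> 100,000 steps
    # Controls how aggressively the sequence is split into temporal segments;
    # 1.25 worked well for all ActorsHQ sequences, other data may need tuning.
    "expansion_factor_threshold": 1.25,
}
print(config)
```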

MHCP001-YUI commented 1 year ago

> Hi, how many frames do you have in your sequence and how many steps do you train for? The rule of thumb for ActorsHQ is to train for 1000*N iterations (for 4x downscaled data), where N is the number of frames in your video sequence. Although this might differ for your dataset, you can try to follow the same guideline.
>
> Moreover, for your dataset, you may need to adapt the expansion factor threshold. Setting it to 1.25 worked well for all ActorsHQ sequences.

Thanks for your reply. In my previous experiments, I had 100 frames in my sequence and trained for 60,000 steps. However, even after increasing the number of iterations to 100,000 as you suggested, the quality basically does not improve. I also tried adjusting the expansion factor threshold, setting it to 1.00 for my slow-motion sequence, but the quality is the same as before.

Moreover, when I increase the hash-map size, the quality gets a bit better. Compared with the single-frame result generated by instant-ngp, it is still relatively poor in areas with high-frequency detail such as the face and ears.
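
(Side note on the hash-map size: in instant-ngp-style hash grid encodings, `log2_hashmap_size` controls the table size, and enlarging it reduces hash collisions so high-frequency regions like the face and ears keep more detail, at the cost of memory. The tiny-cuda-nn-style settings below are a sketch under that assumption and may not match the exact keys exposed by HumanRF.)

```python
# Illustrative tiny-cuda-nn-style hash grid encoding settings.
hash_grid_encoding = {
    "otype": "HashGrid",
    "n_levels": 16,
    "n_features_per_level": 2,
    "log2_hashmap_size": 19,   # try e.g. 21 for a larger table / fewer collisions
    "base_resolution": 16,
    "per_level_scale": 1.5,
}
```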

isikmustafa commented 1 year ago

So, when you run instant-ngp on a single frame, the quality is as expected? Because if camera calibration were the issue, instant-ngp wouldn't produce sharp results either.

Do you already have occupancy grids and per-frame masks for your dataset? Also, the current step size is set to 4e-4; this might need to be adjusted as well if you use a different scaling for your actors/subjects.
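
(As a rough illustration of the scaling remark: if the subject occupies a larger or smaller bounding box than the defaults were tuned for, the ray-marching step size should scale roughly with the scene extent. The reference extent and bounds below are placeholder assumptions, not values from the HumanRF code base.)

```python
import numpy as np

# Illustrative only: the 4e-4 step size quoted above is tied to the scale
# the defaults were tuned for. If your subject is scaled differently, the
# ray-marching step size should grow or shrink with the scene extent.
reference_step_size = 4e-4
reference_scene_extent = 1.0             # assumed extent the default was tuned for
aabb_min = np.array([-0.8, -0.1, -0.8])  # example bounds of your subject
aabb_max = np.array([0.8, 1.9, 0.8])

scene_extent = float(np.max(aabb_max - aabb_min))
step_size = reference_step_size * scene_extent / reference_scene_extent
print(f"suggested step size: {step_size:.2e}")
```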

If the motion complexity of your dataset is similar to that of ActorsHQ, there shouldn't be any issues. Additionally, you can see how HumanRF performed on the DFA (dynamic furry animals) dataset in the supplemental material of our paper. So, I believe there must be a tiny detail missing for optimal quality.

sandy-ssdut commented 1 year ago

Hi, I am a collaborator of the poster.

The camera calibration parameters are accurate, as the quality is as expected when running instant-ngp on a single frame.

After decreasing camera_converge, I regenerated the occupancy grids for my data and used dynamic partitioning during training. With this, the rendering quality of both the still part and the slow-motion part of the sequence improved.
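
(For context on regenerating the occupancy grids: a minimal sketch of carving a per-frame occupancy grid from foreground masks, where a voxel is kept only if it projects inside the mask in nearly all cameras. The projection convention, the `min_views` threshold, and the data layout are assumptions and do not reflect the actual HumanRF tooling.)

```python
import numpy as np

def project(points_world, K, R, t):
    """Pinhole projection of Nx3 world points to Nx2 pixel coordinates (zero skew assumed)."""
    cam = points_world @ R.T + t           # world -> camera coordinates
    uv = cam[:, :2] / cam[:, 2:3]          # perspective divide (assumes points in front of camera)
    return uv * np.array([K[0, 0], K[1, 1]]) + np.array([K[0, 2], K[1, 2]])

def carve_occupancy(voxel_centers, cameras, masks, min_views=34):
    """voxel_centers: (N, 3); cameras: list of (K, R, t); masks: list of HxW bool arrays."""
    votes = np.zeros(len(voxel_centers), dtype=np.int32)
    for (K, R, t), mask in zip(cameras, masks):
        uv = np.round(project(voxel_centers, K, R, t)).astype(int)
        h, w = mask.shape
        inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        hit = np.zeros(len(voxel_centers), dtype=bool)
        hit[inside] = mask[uv[inside, 1], uv[inside, 0]]
        votes += hit
    return votes >= min_views   # boolean occupancy per voxel, e.g. 34 of 36 views
```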

But the person in the slow-motion part is slightly blurrier than in the still part. Is this normal?

isikmustafa commented 1 year ago

Hi, if it is only slightly blurrier, it should be fine. But if the artifacts are quite noticeable, I would further check that everything works as expected.

For reference, you can watch the videos on the HumanRF website. We show results for both moderate and strong motion, and we produce decent results for both. However, if you check our supplemental material for the numerical comparison, you can see that HumanRF performs better when the motion is smaller due to its compression capability, and this is the expected outcome.