facebookarchive / Surround360

Surround360 is Facebook's open source hardware and software for capturing stereoscopic 3D 360 video for VR. The repo contains hardware designs, as well as software for camera control and rendering.

Palace test dataset? #25

Closed VRabbitHole closed 8 years ago

VRabbitHole commented 8 years ago

Have the 360 Render App up and running on Ubuntu 16.04 LTS ... very cool ... thanks for the help.

Any chance of giving us access to the 'Palace of Fine Arts' RAW .bin dataset? ... we know what the demo looks like so it would be a good reference to play with.

Also, the hardware spec for the storage suggests 8 x 1TB SSDs for the FS ... presume NVMe drives would work OK? ... or 2 x pools of 8 x SSDs each ... one for READ and one for WRITE?

superdog8 commented 8 years ago

NVMe drives will work fine. The recommended storage capacity and throughput are for recording 1 hour of footage; they are not minimum requirements for the rendering part. Once you have the raw footage, disk throughput is not so important for the actual processing, since rendering is CPU bound.

VRabbitHole commented 8 years ago

Thanks. Re CPU processing of RAW files ... would a single fast overclocked i7 (8 or 10 cores) still be better than a dual-processor Xeon with 12 cores each?

fbriggs commented 8 years ago

I don't know which would be faster without doing the experiment.

VRabbitHole commented 8 years ago

If there's any way you could let us play with some RAW footage, I'd be happy to real-life benchmark a fast single i7 versus dual 12-core Xeons and share the results. I have two beefy Linux PCs with two fast SSD/NVMe pools for R/W, which would ensure that we're not I/O bound and would easily let us see if there are any real differences between Core/Xeon CPU render pathways.

When we first started working on debayering RED, Arri and Sony RAW footage to linear RGB, we were continually up against the problem of 'shifting bottlenecks' ... sometimes it was GPU bound, other times I/O or CPU bound, or combinations thereof. The last thing we want is to end up in the situation we're in with OZO footage, where render times are 14 seconds per frame (albeit on a beefy Mac Pro cylinder - the PC version of their software isn't available yet).

Let me know if there's any way to get access to some RAW footage you know will stitch and render elegantly ... happy to sign an NDA, etc. I also have multiple Titan Xs and W9100s if the render code will leverage OpenCL/GL (or CUDA) in any way ... Moore's Law is driving GPU innovation way faster than CPU at current silicon fab architectures.

fbriggs commented 8 years ago

Here are 2 frames from the Palace dataset: https://fburl.com/399375715

We will provide a larger dataset ASAP.

VRabbitHole commented 8 years ago

Yeah! ... thanks :-) Just so you know, the Point Grey cameras are on back order until early October ... keep them thar frames a'comin.

VRabbitHole commented 8 years ago

Just checked out the links on both threads and got this message:

Sorry, this page isn't available. The link you followed may be broken, or the page may have been removed.

fbriggs commented 8 years ago

Thanks for letting us know. The link works for me, so we'll have to investigate what is going on here. Can you check if this link works?

https://s3-us-west-2.amazonaws.com/surround360/sample/sample_dataset.zip

VRabbitHole commented 8 years ago

puurrrfect ... downloading 322MB as we speak ... yep, dataset downloaded just fine :)

VRabbitHole commented 8 years ago

Thanks for the note on the black levels ... we're using a calibrated Sony grading monitor, so it's important to know what gamma/luma changes you're making in the .isp processing.

Are you using gamma 2.2 or 2.4? ... and Rec. 709 color space? ... or some other colorimetry standard?

bkcabral commented 8 years ago

Gamma is 1.0/2.2, sRGB color space.
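For anyone reproducing the color pipeline downstream, here is a minimal sketch of what a 1.0/2.2 encoding gamma means for linear RGB. It assumes values normalized to [0, 1] and a pure power law; the actual ISP may use the piecewise sRGB transfer curve or a LUT, so treat this as illustration only.

```cpp
#include <cmath>
#include <cstdio>

// Power-law approximation of a 1.0/2.2 encoding gamma (assumption: not the
// piecewise sRGB curve the real ISP may apply).
float encodeGamma(float linear) {
  return std::pow(linear, 1.0f / 2.2f);  // linear light -> display-referred
}

float decodeGamma(float encoded) {
  return std::pow(encoded, 2.2f);        // display-referred -> linear light
}

int main() {
  // 18% gray in linear light encodes to roughly 0.46 under a 1/2.2 gamma.
  std::printf("encode(0.18)  = %.3f\n", encodeGamma(0.18f));
  std::printf("decode(0.459) = %.3f\n", decodeGamma(0.459f));
  return 0;
}
```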

VRabbitHole commented 8 years ago

Thanks for the clarification, Brian. Great specs, btw ... still a little surprised you decided not to leverage the GPU in the render pipeline ... your previous company just came out with the GTX 1080, which delivers stunning price/performance for video processing.

fbriggs commented 8 years ago

We are looking into GPU acceleration. However, currently the majority of runtime is spent in optical flow, and we have a very fast CPU implementation of optical flow which does not trivially parallelize. In fact we have compared our implementation to OpenCV's optical flow using OpenCL, and our CPU code is faster + produces better results.

bkcabral commented 8 years ago

It is precisely because I built a lot of GPUs that we chose not to do it for the first release. My experience is that it takes 2x to 10x the development time to write CUDA code. So we had a choice: delay the release 3 or 4 months, or get it out to the world faster and have the community help us.

VRabbitHole commented 8 years ago

Makes sense ... and agreed, the optical flow algorithm you've incorporated is the 'secret sauce' that makes the FB stereo 3D effect as good as it is ... just saw Brian's comment on the challenges of writing optimized CUDA code ... will start benchmarking a single i7 versus dual Xeons to see if we get any real-life differences.

mrsabhar commented 8 years ago

@fbriggs "In fact we have compared our implementation to OpenCV's optical flow using OpenCL, and our CPU code is faster + produces better results." Is it possible to share how slow the OpenCL implementation was vs. the CPU one?

fbriggs commented 8 years ago

I don't have the results anymore, and re-running the test is not a high priority because the results were not good. The method in question is: http://docs.opencv.org/3.1.0/d6/d39/classcv_1_1cuda_1_1OpticalFlowDual__TVL1.html

fbriggs commented 8 years ago

To give some rough numbers: we found that Dual_TVL1 flow was about 4x faster than DeepFlow, and our current flow (PixFlow) is between 3x and 20x faster than DeepFlow (depending on resolution and quality settings). However, it is worth noting that we solve 14 flow problems in parallel using multiple threads, which works well on a 16-CPU system but would have scheduling issues on a single GPU. All of these numbers can change depending on different settings of parameters. I never got really high quality results from Dual_TVL1 flow, regardless of settings.
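To make the scheduling point concrete, here is a minimal sketch of the one-thread-per-camera-pair pattern described above. `flowForPair`, `solveAllPairs`, and the Farneback call are stand-ins for illustration, not the project's actual PixFlow code or its real function names.

```cpp
#include <thread>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>

// Stand-in for PixFlow: any dense flow between two adjacent-camera frames.
static void flowForPair(const cv::Mat& a, const cv::Mat& b, cv::Mat& flow) {
  cv::Mat ga, gb;
  cv::cvtColor(a, ga, cv::COLOR_BGR2GRAY);
  cv::cvtColor(b, gb, cv::COLOR_BGR2GRAY);
  cv::calcOpticalFlowFarneback(ga, gb, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
}

// One thread per adjacent camera pair around the ring; with 14 pairs this
// keeps a ~16-CPU machine busy, which is the kind of coarse-grained
// parallelism that does not map neatly onto a single GPU.
void solveAllPairs(const std::vector<cv::Mat>& cams, std::vector<cv::Mat>& flows) {
  const int n = static_cast<int>(cams.size());
  flows.resize(n);
  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i) {
    const int j = (i + 1) % n;  // wrap around the camera ring
    workers.emplace_back([&, i, j] { flowForPair(cams[i], cams[j], flows[i]); });
  }
  for (auto& t : workers) t.join();  // all pairs finish before compositing
}
```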

If you are interested in trying out some GPU flow algorithms, it is relatively simple to add a new flow algorithm to our system. To do so, you would define a new implementation of the interface in OpticalFlowInterface.h, and add a new entry to OpticalFlowFactory.h which allows the flow algorithm to be referenced by name. Flow algorithms can be tested in isolation using the TestOpticalFlow program, which measures runtime and generates visualizations.
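As a starting point, here is a rough sketch of what such an extension could look like. The class name, method signature, and factory registration shown are illustrative assumptions; the real interface and registration mechanism are whatever OpticalFlowInterface.h and OpticalFlowFactory.h define in the repo. OpenCV's DualTVL1 (main video module in OpenCV 3.x) stands in for whichever GPU algorithm you would actually wire up.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>

// Illustrative only: the actual base class and its exact method signature
// live in OpticalFlowInterface.h; this simplified version just shows where
// a new algorithm would plug in.
class MyGpuFlow /* : public OpticalFlowInterface */ {
 public:
  // Compute a dense flow field (CV_32FC2, per-pixel dx/dy) mapping I0 to I1.
  void computeOpticalFlow(const cv::Mat& I0BGRA,
                          const cv::Mat& I1BGRA,
                          cv::Mat& flow) {
    // Stand-in body using OpenCV 3.x DualTVL1; a real contribution would
    // call a GPU implementation (e.g. FlowNet) here instead.
    cv::Mat gray0, gray1;
    cv::cvtColor(I0BGRA, gray0, cv::COLOR_BGRA2GRAY);
    cv::cvtColor(I1BGRA, gray1, cv::COLOR_BGRA2GRAY);
    cv::Ptr<cv::DenseOpticalFlow> tvl1 = cv::createOptFlow_DualTVL1();
    tvl1->calc(gray0, gray1, flow);
  }
};

// A matching entry in OpticalFlowFactory.h would then map a string name
// (e.g. a hypothetical "my_gpu_flow") to this class so it can be selected
// by name and benchmarked with the TestOpticalFlow program.
```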

We tested Dual_TVL1 flow because it is already in OpenCV. However, if you are interested in trying a more advanced / probably faster GPU accelerated optical flow algorithm, I recommend starting with FlowNet by Thomas Brox (http://lmb.informatik.uni-freiburg.de/people/brox/publication_selected.html).

fbriggs commented 8 years ago

Getting FlowNet working in our system would be a very nice contribution to the project :)

fbriggs commented 8 years ago

For further discussion of optical flow, please open a separate issue. Closing this thread, as it was originally on the topic of sample data.