Some questions about large-scale scenes

Hello, and thanks for your fantastic work! I have some questions about large-scale scene rendering:

When loading the DTU dataset, if there are two or more cameras, the focal length and center are averaged across those cameras. Why is it done this way?
Every scan in DTU contains 49 images. What if the number of images differs between scans? I expect the code would raise an error at batch concatenation. Would this also affect the training process?
Is pixel-nerf suitable for large-scale scene datasets such as ETH3D?
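
To make the first two questions concrete, here is a minimal sketch of how I understand the behavior. It is not the repository's code: the helper names `average_intrinsics` and `can_stack` are hypothetical, the numbers are made up, and plain Python lists stand in for tensors.

```python
from statistics import mean

def average_intrinsics(focals, centers):
    """Average per-camera focal lengths and centers (hypothetical helper).

    focals:  list of focal lengths, one per camera
    centers: list of (cx, cy) principal points, one per camera
    """
    focal = mean(focals)
    cx = mean(c[0] for c in centers)
    cy = mean(c[1] for c in centers)
    return focal, (cx, cy)

def can_stack(scans):
    """Return True if all scans hold the same number of images,
    i.e. they could be concatenated into one rectangular batch."""
    return len({len(images) for images in scans}) <= 1

# Question 1: two cameras collapse into a single shared intrinsic.
focal, center = average_intrinsics([100.0, 110.0], [(50.0, 40.0), (54.0, 44.0)])

# Question 2: equal view counts stack, unequal view counts do not.
same = can_stack([["img"] * 49, ["img"] * 49])
ragged = can_stack([["img"] * 49, ["img"] * 30])
```

In the second case I assume some workaround would be needed, such as sampling a fixed number of views per scan, but I am not sure whether that changes what the model learns.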

Looking forward to your reply! Thanks, Shengkun Tang
