Please: one step, take all. Experts, get it all done in one go.

yuedajiong commented 4 months ago

One step, take all:

Way 1: offline/explicit

A 4D (time + stereo) world that is strongly physical and stereo-consistent, supports any camera pose, is rendered offline, and is interactive and semantically highly controllable.

OR

Way 2: online/implicit. A world generated with binocular stereo consistency, where the observer pose changes in place, generation is online and real-time, and the result is interactive and semantically highly controllable. The technical path: train on binocular video captured in Unreal Engine or in the physical world.

Your group is very strong and is capable of achieving this. It is the endgame for visual games.

xiaoyuan1996 commented 4 months ago

You are trying to get fat off a single bite here. First send them 2 million.

yuedajiong commented 4 months ago

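@xiaoyuan1996 2 million would not be enough; it would take 20 billion US dollars. Go find the princes in the Middle East, they have plenty of money. If you do not try to get fat off a single bite, and just call the work "a little ahead of the curve", then whatever you build ends up running behind the big American companies and the agile startups. Of course, the essential point is that the ultimate vision problem looks something like this: dynamic, stereoscopic, interactive scenes. That means 3D games in UE, or true stereoscopic games in VR. It is the same as human vision in the physical world: seeing everything in the world through two eyes.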

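Way 2 is relatively easier to implement, and a small-scale concept version already exists. It only covers a few object categories, the scenes are simple, the resolution is low, and the binocular output has a sense of depth but is also inconsistent. It can be viewed as left/right side-by-side video on a Quest or by crossing your eyes; I can share it privately so you can see the effect.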
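For anyone who wants to try the side-by-side viewing mentioned above, here is a minimal sketch of packing one stereo frame pair into a single image. It is not the code used for the demo. The function name `pack_side_by_side` and the `cross_eyed` flag are hypothetical, and frames are assumed to be NumPy arrays of shape height x width x channels. For cross-eyed viewing the right-eye image goes on the left half; for parallel viewing or a headset side-by-side player the left-eye image goes on the left half.

```python
import numpy as np

def pack_side_by_side(left, right, cross_eyed=False):
    """Pack a left/right frame pair (H x W x C) into one side-by-side frame.

    Hypothetical helper for illustration only.
    cross_eyed=False: left-eye image on the left half (parallel / headset).
    cross_eyed=True:  right-eye image on the left half (cross-eyed viewing).
    """
    left = np.asarray(left)
    right = np.asarray(right)
    if left.shape != right.shape:
        raise ValueError("left and right frames must have the same shape")
    first, second = (right, left) if cross_eyed else (left, right)
    # Axis 1 is the width axis, so the two views end up next to each other.
    return np.concatenate([first, second], axis=1)
```

Applying this to every frame pair of a clip gives a side-by-side video that can be written out with any video writer.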

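By modifying Latte, borrowing ideas from Sora, and extending it a bit, implementing a concept version is fairly easy. For stereo, getting the data is the hard part. I used synthetic data captured from dynamic 3D models with two cameras at a fixed spacing. I do not have the money to collect binocular data at massive scale, let alone a large cluster to train for Sora-level visual quality.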
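The fixed-spacing dual-camera capture described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual data pipeline: the function name `stereo_camera_positions` is hypothetical, the default baseline of 0.065 (about a human interpupillary distance in meters) is an assumed value, and a right-handed coordinate system is assumed so that the rig's right axis is `forward x up`.

```python
import numpy as np

def stereo_camera_positions(center, forward, up, baseline=0.065):
    """Return (left, right) camera positions for a fixed-baseline stereo rig.

    Hypothetical helper for illustration only.
    center:   rig center position, shape (3,)
    forward:  viewing direction, shape (3,)
    up:       up direction, shape (3,), not parallel to forward
    baseline: distance between the two cameras (assumed value, in meters)
    """
    center = np.asarray(center, dtype=float)
    forward = np.asarray(forward, dtype=float)
    up = np.asarray(up, dtype=float)
    forward = forward / np.linalg.norm(forward)
    # Right axis in a right-handed system: forward x up.
    right_axis = np.cross(forward, up)
    right_axis = right_axis / np.linalg.norm(right_axis)
    half = 0.5 * baseline * right_axis
    # Both cameras keep the same orientation; only the position is shifted.
    return center - half, center + half
```

Rendering the same animated scene from both positions at every time step yields a paired left/right video with a constant baseline.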

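Current state: scene: independently moving people; length: a few frames to a few dozen frames; binocular stereo video. I will post it here shortly.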

yuedajiong commented 4 months ago

The first one in the world:
AI-based dynamic/temporal stereo/binocular generation:
https://github.com/yuedajiong/super-ai/blob/main/superai-20240223-vision-dynamic-stereo-binocular.gif

Vchitect / Latte

Please: one step, take all. Experts, get it all done in one go. #21