salesforce / ULIP

Do you have a plan to pre-train ULIP on a larger dataset, Objaverse-XL, containing 10M 3D objects? #56

Open auniquesun opened 1 month ago

auniquesun commented 1 month ago

As the title says, do you have a plan to pre-train on Objaverse-XL, a 10M-scale 3D dataset?

I think the computing resources at your company are sufficient to complete such exciting work.

Tycho-Xue commented 3 weeks ago

Hi @auniquesun, sorry for the late reply. If you want a faster response, tag me here so that I get an email; otherwise I won't be notified unless I check. Objaverse-XL is something I planned to do, but I currently don't have the bandwidth. I actually tried it last year when it was first open-sourced, but at that time the preprocessing of Objaverse-XL was fairly complicated: I needed to extract point clouds (preferably colored) and multi-view images, and then caption the multi-view images. At least at that time, Objaverse-XL was not like Objaverse, which can be downloaded directly and is easier to preprocess. Using the same pipeline as ULIP-2, I ran into quite a few issues preprocessing all the data, and I had other more urgent work, so I didn't continue.
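
To make the first preprocessing step mentioned above more concrete, here is a minimal sketch of extracting a colored point cloud from a triangle mesh by area-weighted surface sampling. This is not the ULIP-2 preprocessing code; the function names (`triangle_area`, `sample_colored_points`) and the input layout (per-vertex RGB colors, faces as index triples) are assumptions made for illustration only. Rendering multi-view images and captioning them are separate steps and are not shown.

```python
import random


def triangle_area(a, b, c):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5


def sample_colored_points(vertices, colors, faces, n_points, seed=0):
    """Sample n_points (x, y, z, r, g, b) tuples uniformly over the mesh surface.

    Hypothetical helper: `vertices` and `colors` are lists of 3-tuples with one
    entry per vertex, and `faces` is a list of vertex-index triples.
    """
    rng = random.Random(seed)
    # Pick faces with probability proportional to their area, so that large
    # triangles receive proportionally more points.
    areas = [triangle_area(*(vertices[i] for i in face)) for face in faces]
    chosen = rng.choices(faces, weights=areas, k=n_points)

    points = []
    for i, j, k in chosen:
        # Square-root trick: yields barycentric weights that are uniformly
        # distributed inside the triangle.
        r1, r2 = rng.random(), rng.random()
        s = r1 ** 0.5
        w = (1.0 - s, s * (1.0 - r2), s * r2)
        xyz = tuple(
            w[0] * vertices[i][d] + w[1] * vertices[j][d] + w[2] * vertices[k][d]
            for d in range(3)
        )
        # Interpolate per-vertex colors with the same weights.
        rgb = tuple(
            w[0] * colors[i][d] + w[1] * colors[j][d] + w[2] * colors[k][d]
            for d in range(3)
        )
        points.append(xyz + rgb)
    return points


if __name__ == "__main__":
    # A unit square in the z = 0 plane, split into two triangles.
    verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    cols = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
    tris = [(0, 1, 2), (0, 2, 3)]
    cloud = sample_colored_points(verts, cols, tris, n_points=5)
    for point in cloud:
        print(point)
```

In practice a mesh library would handle file loading and texture lookup; the sketch only shows why colored point extraction needs both geometry and color information per object, which is part of what makes preprocessing a 10M-object dataset expensive.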