alexhernandezgarcia / gflownet

Generative Flow Networks - GFlowNet
https://gflownet.readthedocs.io/en/latest/
Apache License 2.0
173 stars 11 forks

Training MLE-style on continuous env? #342

Open pieris98 opened 3 days ago

pieris98 commented 3 days ago

Hey @alexhernandezgarcia and others, as I mentioned in #330, I want to use Continuous GFNs to estimate the distribution of a 3D dataset for my thesis. I thought a nice starting point would be ModelNet40, since the whole dataset has a fixed number (2048) of 3D coordinate points per point cloud sample. Then, if that works, I could move to more complex datasets like ShapeNetCore.

As Alex suggested in #330, I'm trying to use the ContinuousCube env to sample this fixed number of 3D vectors. My questions are:

  1. Should I use one of the vanilla objectives (most probably TB)? Or is there a better or alternative objective for training on a continuous dataset, similar to what Lahlou et al. did for 2D images?
  2. I was also thinking it'd be a good idea to start with an MLE-like objective to directly learn the training data distribution. However, I don't know how to do this via the proxy implementation, or whether it's possible for continuous domains in this repo.

Could you help me clarify what is possible here? Huge thanks for all your contributions and work on this repo. Best, Pieris.

alexhernandezgarcia commented 3 days ago

Hi Pieris,

Cool to hear you're working on this! I'm currently travelling, so I can't be of much help at the moment (and until mid-November), but I wanted to mention one thing: there's some recently developed code that could potentially make your life a lot easier for this task, but it would still take a few weeks to get merged.

To your questions:

  1. Go with TB.
  2. Yes, an MLE-like objective does seem the way to go.
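
For reference, the two objectives named above can be written down in a few lines. This is only a minimal sketch in plain Python, not this repo's API: the function names and arguments are hypothetical, and the per-step log-probabilities are assumed to come from whatever forward and backward policies are in use. The MLE-style variant assumes one common reading of "MLE-like", namely that trajectories ending in dataset samples are obtained by sampling backward from the data and the forward policy's log-likelihood of those trajectories is maximized; the thread does not confirm that this is what is meant.

```python
def trajectory_balance_loss(log_z, log_pf_steps, log_pb_steps, log_reward):
    """Squared trajectory balance (TB) residual for a single trajectory.

    log_z        : learned estimate of the log-partition function (hypothetical input)
    log_pf_steps : forward-policy log-probabilities, one per transition
    log_pb_steps : backward-policy log-probabilities, one per transition
    log_reward   : log-reward of the terminating state
    """
    residual = log_z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)
    return residual ** 2


def mle_style_loss(log_pf_steps):
    """Negative log-likelihood of one trajectory under the forward policy.

    Assumption: the trajectory ends in a dataset sample and was obtained
    by sampling backward from that sample.
    """
    return -sum(log_pf_steps)


# Toy numbers, chosen by hand purely for illustration.
tb = trajectory_balance_loss(0.0, [-1.0, -2.0], [-0.5, -0.5], -2.0)
nll = mle_style_loss([-1.0, -2.0])
```

Note that the MLE-style loss needs no reward or proxy at all, only trajectories that end in training samples, whereas TB requires a reward for each terminating state.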

pieris98 commented 3 days ago

Hey Alex, I really appreciate you taking the time to reply despite being on the move! Unfortunately, time is of the essence: my thesis is due in early November. Do you know if the code is available as a pull request or branch? Or is it being developed internally? Thanks again for all your help. Best, Pieris.

alexhernandezgarcia commented 3 days ago

It's only internal (not public) for now.
