USGS-R / river-dl

Deep learning model for predicting environmental variables on river systems
Creative Commons Zero v1.0 Universal

sharing pytorch starting point #157

Closed by jdiaz4302 2 years ago

jdiaz4302 commented 2 years ago

I'd like to pass along the RGCN PyTorch code soon, and I was wondering how you would prefer to receive it. My goal is to provide RGCN code that mimics both the river-dl model and the published/simplified model, demonstrates all or most of your pipeline needs, and shows some of the UQ that can accompany it. It isn't necessarily my goal to replace TensorFlow for the project; I want to provide an option or starting point for you all to run with if it interests you. So I'm interested to hear how you'd like this to be shared, because it doesn't seem like a PR that would normally be forward progress for the project. It's more of a possible side route.

SimonTopp commented 2 years ago

Thanks for bringing this up @jdiaz4302. I was thinking about this as I reviewed Jeff's most recent PR. I think that ultimately we want functionality for both PyTorch and TF models. I've got a branch going right now that includes some functions to 1) reshape the output of prep_all_data so it can be applied across models with different format expectations, and 2) train/predict with PyTorch models within the pipeline while maintaining the same file structure.
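For readers following along, the reshaping idea in point 1 can be sketched roughly as below. This is only an illustration: the input and output shapes, the batch-major row ordering, and the function name `to_segment_layout` are all assumptions, not the actual output format of prep_all_data or the code on the branch.

```python
import numpy as np


def to_segment_layout(x, n_segs):
    """Hypothetical helper: [n_batch * n_segs, seq_len, n_feat] -> [n_batch, seq_len, n_segs, n_feat].

    Assumes rows are ordered batch-major: all segments of batch 0 first,
    then all segments of batch 1, and so on.
    """
    n_rows, seq_len, n_feat = x.shape
    if n_rows % n_segs != 0:
        raise ValueError("first dimension must be a multiple of n_segs")
    n_batch = n_rows // n_segs
    # Split the first axis into (batch, segment), then move the segment axis after time.
    x = x.reshape(n_batch, n_segs, seq_len, n_feat)
    return x.transpose(0, 2, 1, 3)


# Toy data: 2 batches, 3 segments, 4 time steps, 5 features.
x = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2 * 3, 4, 5)
y = to_segment_layout(x, n_segs=3)
print(y.shape)  # (2, 4, 3, 5)
```

Keeping the reshape in one small function like this would let each model declare the layout it expects while the data preparation step stays unchanged.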

I was planning on it being somewhat of a standalone branch to go with the hold-out generalizability manuscript, but it might also just be a good way to incorporate the PyTorch work you've been doing. Selfishly, merging it with river-dl would also get some more eyes on my code, which wouldn't be horrible either.
@janetrbarclay, @jsadler2, what do you think?

jdiaz4302 commented 2 years ago

"I've got a branch going right now that includes some functions to ... 2) Train/predict with Pytorch models within the pipeline while maintaining the same file structure."

Nice! For clarification, have you been using the pipeline and RGCN in pytorch already?

jsadler2 commented 2 years ago

I definitely see a lot of pluses to making river-dl play with pytorch as well as TF. If you don't think it's too much of a lift, @SimonTopp, it seems like this could be a good opportunity to at least start to add that functionality. I'd be happy to help with PR reviews for it.

SimonTopp commented 2 years ago

Sounds good. I'll plan to submit a PR sometime next week, after we merge #148. In terms of your question:

"Nice! For clarification, have you been using the pipeline and RGCN in pytorch already?"

I've been using the pipeline to train GraphWaveNet, which is also in PyTorch. Plugging the torch RGCN in should be very straightforward. I should have it all debugged and ready to go today or early next week.

jdiaz4302 commented 2 years ago

Okay, if that's the case, my efforts might be better suited supplementing your work. Can you point me to that repo?

Also, let me know if you'd like me to pass along my torch RGCN code to help with debugging; it's been working for me independently of the pipeline.

SimonTopp commented 2 years ago

For sure, branch is here with most of the integration code in gwn_integration_utils.py. It's a little sandboxy right now, but feel free to poke around all the same. Particularly would be curious to get some feedback on how I've set up the train_torch function. Given this conversation, I'll probably go back in to make the reshape_for_gwn function more generic.

jdiaz4302 commented 2 years ago

Nice job! I think the forward passing, weight updating, and early stopping are set up great.
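For anyone reading without the branch open, the three pieces mentioned here (forward pass, weight update, early stopping) fit together roughly as in the sketch below. It is not the train_torch function from the branch; the helper name `train_with_early_stopping`, the MSE loss, the Adam optimizer, and the patience value are all assumptions chosen for illustration.

```python
import copy

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_with_early_stopping(model, train_loader, val_loader,
                              max_epochs=50, patience=5, lr=1e-2):
    """Hypothetical helper: train `model` and stop once validation loss stalls."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    best_val, best_state, epochs_since_best = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)  # forward pass
            loss.backward()              # backpropagate gradients
            optimizer.step()             # weight update

        # Mean validation loss over batches, without tracking gradients.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)
            val_loss /= len(val_loader)

        if val_loss < best_val:
            best_val, epochs_since_best = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break  # early stopping: no improvement for `patience` epochs

    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best weights seen
    return model, best_val


# Toy regression problem standing in for real river data.
torch.manual_seed(0)
x = torch.randn(64, 3)
y = x.sum(dim=1, keepdim=True)
train_loader = DataLoader(TensorDataset(x[:48], y[:48]), batch_size=16, shuffle=True)
val_loader = DataLoader(TensorDataset(x[48:], y[48:]), batch_size=16)
model, best_val = train_with_early_stopping(nn.Linear(3, 1), train_loader, val_loader)
print(f"best validation loss: {best_val:.4f}")
```

Restoring the best state dict at the end means the returned model corresponds to the lowest validation loss rather than the last epoch run.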

In my experience, using the torch dataloaders (as you have) makes for a really easy lift to parallel GPUs via torch.nn.DataParallel. At a glance, I think both of my RGCN models should feed into that without any big issues (but I haven't tried to run them together yet).
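A minimal sketch of that pattern, assuming a small stand-in network rather than either RGCN model: a model fed by a DataLoader is wrapped in torch.nn.DataParallel when more than one GPU is visible, and otherwise runs unchanged on a single device.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in model (an assumption); the RGCN models are not reproduced here.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

# Wrap only when several GPUs are available; otherwise keep the plain model.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Random toy data: 32 samples with 8 features each.
dataset = TensorDataset(torch.randn(32, 8), torch.randn(32, 1))
loader = DataLoader(dataset, batch_size=8)

model.eval()
with torch.no_grad():
    for x, y in loader:
        # DataParallel splits each batch along dimension 0 across the GPUs.
        pred = model(x.to(device))
print(pred.shape)
```

Because DataParallel scatters inputs along dimension 0, this only helps when that dimension indexes independent samples; whether that holds for a given graph model's batch layout would need checking.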

jdiaz4302 commented 2 years ago

Feel free to ping me to contribute those or review when you're ready

SimonTopp commented 2 years ago

Sounds good to me! If you want to contribute the torch models, feel free to make a PR for them. Alternatively, I can just copy them out of the run-pgdl-da repository and request your review when I submit the PR. Let me know what you prefer.

jdiaz4302 commented 2 years ago

I can submit a PR to your branch later today with model code

SimonTopp commented 2 years ago

Addressed in #163