USGS-R / river-dl

Deep learning model for predicting environmental variables on river systems
Creative Commons Zero v1.0 Universal
21 stars 15 forks source link

remove train_model_cli #117

Closed jsadler2 closed 2 years ago

jsadler2 commented 3 years ago

In a meeting we decided to remove this because

  1. it is a maintenance burden
  2. it is largely project-specific
  3. it is only used in a small number of settings. The only time this is actually necessary is if you want to train separate models in parallel using GPUs. If you are just prototyping, it is easier to use the Pangeo-ML or an interactive GPU instance on Tallgrass
SimonTopp commented 2 years ago

I came up with a workaround to loading the necessary modules outside of the CLI for parallel training. It utilizes wildcards and group arguments within the Snakefile and cluster_config.yml. The biggest benefit to this is that we don't need to maintain two train scripts, one for the CLI and one for 'normal' runs. I might also be able to simplify it so that it doesn't need an additional rule by just using an os.system function within the training routine, but I haven't tested it yet. My question to you all @janetrbarclay, @jsadler2, @aappling-usgs is what's the best way to communicate this within the repository. My thoughts for now are:

image

janetrbarclay commented 2 years ago

I think this is a great option. I would be in favor of the 2nd (an example Snakefile) and either a blub in the readme or maybe a note in the readme and more extensive comments in the example snakefile than usual.


Janet Barclay U.S. Geological Survey New England Water Science Center Connecticut Office 101 Pitkin St. East Hartford, CT 06108

Phone (office) 860 291-6763 Fax 860 291-6799 Email @.**@*.**@*.***> https://www.usgs.gov/staff-profiles/janet-barclay


From: Simon Topp @.> Sent: Wednesday, November 17, 2021 9:21 AM To: USGS-R/river-dl @.> Cc: Barclay, Janet R @.>; Mention @.> Subject: [EXTERNAL] Re: [USGS-R/river-dl] remove train_model_cli (#117)

This email has been received from outside of DOI - Use caution before clicking on links, opening attachments, or responding.

I came up with a workaround to loading the necessary modules outside of the CLI for parallel training. It utilizes wildcards and group arguments within the Snakefile and cluster_config.yml. The biggest benefit to this is that we don't need to maintain two train scripts, one for the CLI and one for 'normal' runs. I might also be able to simplify it so that it doesn't need an additional rule by just using an os.system function within the training routine, but I haven't tested it yet. My question to you all @janetrbarclayhttps://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjanetrbarclay&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176433598%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=c9jCBDIBXArCrUi%2FM2CHt2r%2Bo%2F2NaPKlfWGtoIkUuyQ%3D&reserved=0, @jsadler2https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjsadler2&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176433598%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Nl2woCj6ErIyxqRCa%2BW%2Fb725XnVWYjUx7buhUmbF%2F2s%3D&reserved=0, @aappling-usgshttps://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Faappling-usgs&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176443551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=TYyyMcMilZAu92ZGK4AEF0YcHrWmEaUkDoaZtFo5u7g%3D&reserved=0 is what's the best way to communicate this within the repository. My thoughts for now are:

[image]https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuser-images.githubusercontent.com%2F33098707%2F142216496-74660fa6-7328-4061-a155-0835c0d29bf8.png&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176453507%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=pyKvS2W7yvlk%2FjMnr%2FTS6TqSY4dIFMeApK6evyz2kEw%3D&reserved=0

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FUSGS-R%2Friver-dl%2Fissues%2F117%23issuecomment-971627738&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176453507%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2B6CE2rSRAPe2ylanK2cpnZxAgh%2FDPONJtsGMcSGC4O0%3D&reserved=0, or unsubscribehttps://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAA5H7UECXRUF66NWG654353UMO27LANCNFSM47EDPGXQ&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176463456%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=3KjxipHfApKX%2BYNhI5qd%2FCYKXd5KeuKALe%2BhXvPEkFc%3D&reserved=0. Triage notifications on the go with GitHub Mobile for iOShttps://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fapps.apple.com%2Fapp%2Fapple-store%2Fid1477376905%3Fct%3Dnotification-email%26mt%3D8%26pt%3D524675&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176463456%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=qsSme4MElNyvdIJkzTOl7mLYtT%2FWBmcGJVPH%2FwP9krI%3D&reserved=0 or Androidhttps://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.github.android%26referrer%3Dutm_campaign%253Dnotification-email%2526utm_medium%253Demail%2526utm_source%253Dgithub&data=04%7C01%7Cjbarclay%40usgs.gov%7C77d3099e45f044c1492008d9a9d59b93%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637727557176473417%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=iYDaIZMWS36Uc2bL0S7%2FU7k6CfYyRCRjIcz%2Fk0CgRrE%3D&reserved=0.

SimonTopp commented 2 years ago

@jsadler2 and @janetrbarclay, what do you both think about closing this and removing train_model_cli. I think we've slowly addressed this through various PRs since it was posted.

jsadler2 commented 2 years ago

Let's do it!

On Tue, Jan 25, 2022 at 1:21 PM Simon Topp @.***> wrote:

@jsadler2 https://github.com/jsadler2 and @janetrbarclay https://github.com/janetrbarclay, what do you both think about closing this and removing train_model_cli. I think we've slowly addressed this through various PRs since it was posted.

— Reply to this email directly, view it on GitHub https://github.com/USGS-R/river-dl/issues/117#issuecomment-1021527481, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABU7FUPMCMOPRRH2J77Q27DUX3Z4HANCNFSM47EDPGXQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>