deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics
https://docs.deepmodeling.com/projects/deepmd/
GNU Lesser General Public License v3.0

feat(pt): Add command to check the available model branches in a multi-task pre-trained model (Issue #3742) #3796

Closed Chengqian-Zhang closed 1 month ago

Chengqian-Zhang commented 1 month ago

Solves #3742. The four situations below demonstrate the new command; a minimal sketch of the dispatch logic they exercise follows the list.

  1. Situation one (the correct way to use it):

     dp --pt show multitask_model.pt model-branch type-map descriptor fitting-net

     [2024-05-22 10:38:16,678] DEEPMD INFO This is a multitask model
     [2024-05-22 10:38:16,678] DEEPMD INFO Available model branches are ['MPtraj_v026_01-mix-Utype', 'MPtraj_v026_02-mix-Utype', 'MPtraj_v026_03-mix-Utype', 'MPtraj_v026_04-mix-Utype', 'MPtraj_v026_05-mix-Utype', 'MPtraj_v026_06-mix-Utype', 'MPtraj_v026_07-mix-Utype', 'MPtraj_v026_08-mix-Utype', 'MPtraj_v026_09-mix-Utype', 'MPtraj_v026_10-mix-Utype', 'MPtraj_v026_11-mix-Utype']
     [2024-05-22 10:38:16,679] DEEPMD INFO The type_map of branch MPtraj_v026_01-mix-Utype is ['H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F', 'Ne', 'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'Cl', 'Ar', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Xe', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Pm', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Po', 'At', 'Rn', 'Fr', 'Ra', 'Ac', 'Th', 'Pa', 'U', 'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No', 'Lr', 'Rf', 'Db', 'Sg', 'Bh', 'Hs', 'Mt', 'Ds', 'Rg', 'Cn', 'Nh', 'Fl', 'Mc', 'Lv', 'Ts', 'Og', 'Co_U', 'Cr_U', 'Fe_U', 'Mn_U', 'Mo_U', 'Ni_U', 'V_U', 'W_U']
     (output for the other branches omitted)
     [2024-05-22 10:38:16,679] DEEPMD INFO The descriptor parameter of branch MPtraj_v026_04-mix-Utype is {'type': 'dpa2', 'repinit': {'tebd_dim': 256, 'rcut': 9.0, 'rcut_smth': 8.0, 'nsel': 120, 'neuron': [25, 50, 100], 'axis_neuron': 12, 'activation_function': 'tanh'}, 'repformer': {'rcut': 4.0, 'rcut_smth': 3.5, 'nsel': 40, 'nlayers': 12, 'g1_dim': 128, 'g2_dim': 32, 'attn2_hidden': 32, 'attn2_nhead': 4, 'attn1_hidden': 128, 'attn1_nhead': 4, 'axis_neuron': 4, 'activation_function': 'tanh', 'update_h2': False, 'update_g1_has_conv': True, 'update_g1_has_grrg': True, 'update_g1_has_drrd': True, 'update_g1_has_attn': True, 'update_g2_has_g1g1': False, 'update_g2_has_attn': True, 'update_style': 'res_residual', 'update_residual': 0.01, 'update_residual_init': 'norm', 'attn2_has_gate': True}, 'add_tebd_to_repinit_out': False}
     (output for the other branches omitted)
     [2024-05-22 10:38:16,679] DEEPMD INFO The fitting_net parameter of branch MPtraj_v026_01-mix-Utype is {'neuron': [240, 240, 240], 'activation_function': 'tanh', 'resnet_dt': True, 'seed': 1, '_comment': " that's all"}
     (output for the other branches omitted)

  2. Situation two (singletask_model.pt is not a multi-task pre-trained model):

     dp --pt show singletask_model.pt model-branch type-map descriptor fitting-net

     [2024-05-22 10:43:11,642] DEEPMD INFO This is a singletask model
     RuntimeError: The 'model-branch' option requires a multitask model. The provided model does not meet this criterion.

  3. Situation three (using the TensorFlow backend):

     dp show multitask_model.pt model-branch

     RuntimeError: unknown command list-model-branch

  4. A frozen model file with a .pth extension is used in the same way as a checkpoint file with a .pt extension:

     dp --pt show frozen_model.pth type-map descriptor fitting-net

     [2024-05-22 10:46:26,365] DEEPMD INFO This is a singletask model
     [2024-05-22 10:46:26,365] DEEPMD INFO The type_map is ['H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F', 'Ne', 'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'Cl', 'Ar', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Xe', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Pm', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Po', 'At', 'Rn', 'Fr', 'Ra', 'Ac', 'Th', 'Pa', 'U', 'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No', 'Lr', 'Rf', 'Db', 'Sg', 'Bh', 'Hs', 'Mt', 'Ds', 'Rg', 'Cn', 'Nh', 'Fl', 'Mc', 'Lv', 'Ts', 'Og', 'Co_U', 'Cr_U', 'Fe_U', 'Mn_U', 'Mo_U', 'Ni_U', 'V_U', 'W_U']
     [2024-05-22 10:46:26,365] DEEPMD INFO The descriptor parameter is {'type': 'dpa2', 'repinit': {'tebd_dim': 256, 'rcut': 9.0, 'rcut_smth': 8.0, 'nsel': 120, 'neuron': [25, 50, 100], 'axis_neuron': 12, 'activation_function': 'tanh'}, 'repformer': {'rcut': 4.0, 'rcut_smth': 3.5, 'nsel': 40, 'nlayers': 12, 'g1_dim': 128, 'g2_dim': 32, 'attn2_hidden': 32, 'attn2_nhead': 4, 'attn1_hidden': 128, 'attn1_nhead': 4, 'axis_neuron': 4, 'activation_function': 'tanh', 'update_h2': False, 'update_g1_has_conv': True, 'update_g1_has_grrg': True, 'update_g1_has_drrd': True, 'update_g1_has_attn': True, 'update_g2_has_g1g1': False, 'update_g2_has_attn': True, 'update_style': 'res_residual', 'update_residual': 0.01, 'update_residual_init': 'norm', 'attn2_has_gate': True}, 'add_tebd_to_repinit_out': False}
     [2024-05-22 10:46:26,365] DEEPMD INFO The fitting_net parameter is {'neuron': [240, 240, 240], 'activation_function': 'tanh', 'resnet_dt': True, 'seed': 1, '_comment': " that's all"}
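
All four situations above go through the same dispatch: detect whether the checkpoint is multi-task, then print or refuse each requested attribute. The following is a minimal illustrative sketch of that logic, not the actual code in `deepmd/pt/entrypoints/main.py`; the function name, the `model_dict` marker, and the parameter layout are assumptions.

```python
import logging

log = logging.getLogger("DEEPMD")


def show_model_info(model_params: dict, attributes: list[str]) -> None:
    """Print the requested attributes of a loaded model configuration (sketch)."""
    # Assumption: multi-task checkpoints carry a "model_dict" mapping branch
    # names to per-branch configurations; single-task ones store them directly.
    multi_task = "model_dict" in model_params
    log.info("This is a %s model", "multitask" if multi_task else "singletask")

    if "model-branch" in attributes:
        if not multi_task:
            raise RuntimeError(
                "The 'model-branch' option requires a multitask model. "
                "The provided model does not meet this criterion."
            )
        log.info("Available model branches are %s", list(model_params["model_dict"]))

    if "type-map" in attributes:
        if multi_task:
            for name, branch in model_params["model_dict"].items():
                log.info("The type_map of branch %s is %s", name, branch["type_map"])
        else:
            log.info("The type_map is %s", model_params["type_map"])
    # "descriptor" and "fitting-net" would be handled analogously.
```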

Summary by CodeRabbit

coderabbitai[bot] commented 1 month ago

Walkthrough

The changes introduce a new command-line argument --list-model-branch to list model branches in a multitask pretrained model, supported by the PyTorch backend. The updates include handling this flag in the training command, adding a display function for model information, and enhancing test cases for single-task and multi-task models. Additionally, a utility function to run DP directly from the entry point has been added to improve testing performance.

Changes

| File/Directory | Summary |
| --- | --- |
| `deepmd/main.py` | Added `--list-model-branch` argument to `parser_train` for listing model branches in multitask pretrained models. |
| `deepmd/pt/entrypoints/main.py` | Added logic to handle the `--list-model-branch` flag in the train command and a show function to display model info. |
| `doc/train/finetuning.md` | Updated documentation to include the new `--list-model-branch` command for checking model branches. |
| `source/tests/pt/common.py` | Introduced a `run_dp` function to run DP directly from the entry point, improving performance by avoiding subprocess use (see the sketch after this table). |
| `source/tests/pt/test_dp_show.py` | Added test cases for single-task and multi-task models, including setup, training, and model information display. |
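
For reference, a `run_dp`-style helper could look roughly like the sketch below. This is an illustrative guess, not the helper actually added in `source/tests/pt/common.py`; in particular, importing the CLI entry point from `deepmd.main` and its signature are assumptions.

```python
import shlex

# Assumption: the dp CLI entry point is importable as deepmd.main.main and
# accepts the argument list directly; the real test helper may differ.
from deepmd.main import main


def run_dp(cmd: str) -> int:
    """Run a `dp ...` command line in-process instead of spawning a subprocess."""
    args = shlex.split(cmd)
    if args and args[0] == "dp":
        args = args[1:]  # keep only the sub-command and its arguments
    main(args)
    return 0
```

A test could then call, for example, `run_dp("dp --pt train input.json")` without paying the startup cost of a fresh Python process for every invocation.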

Sequence Diagram(s) (Beta)

sequenceDiagram
    participant User
    participant CLI
    participant Main
    participant EntryPoints

    User->>CLI: Execute train command with --list-model-branch
    CLI->>Main: Parse arguments
    Main->>EntryPoints: Call train function with flags
    EntryPoints-->>EntryPoints: Check --list-model-branch flag
    alt Flag is set
        EntryPoints->>EntryPoints: Process pretrained model for multitask mode
        EntryPoints->>EntryPoints: Extract and display model branches
    else Flag is not set
        EntryPoints->>EntryPoints: Proceed with regular training process
    end
    EntryPoints-->>Main: Return control
    Main-->>CLI: Finish execution
    CLI-->>User: Display results

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 97.87234%, with 1 line in your changes missing coverage. Please review.

Project coverage is 77.78%. Comparing base (0bcb84f) to head (0d1c29d). Report is 1 commit behind head on devel.

:exclamation: Current head 0d1c29d differs from the pull request's most recent head d9cb79f.

Please upload reports for the commit d9cb79f to get more accurate results.

| Files | Patch % | Lines |
| --- | --- | --- |
| `deepmd/pt/entrypoints/main.py` | 97.72% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##            devel    #3796      +/-   ##
==========================================
- Coverage   82.48%   77.78%   -4.70%
==========================================
  Files         513      414      -99
  Lines       48993    35480   -13513
  Branches     2986      926    -2060
==========================================
- Hits        40411    27599   -12812
+ Misses       7671     7365     -306
+ Partials      911      516     -395
```


Chengqian-Zhang commented 1 month ago

> Please replace the corresponding note in doc/train/finetuning.md.

Done

Chengqian-Zhang commented 1 month ago

> I don't see any reason to use dp train for this feature.

I have changed it to `dp --pt list-model-branch model.pt`.

Chengqian-Zhang commented 1 month ago

You can review this PR again if you have time. @iProzd @njzjz

wanghan-iapcm commented 1 month ago

@njzjz @iProzd @Chengqian-Zhang Hey guys, how about providing a command like

dp show

to present the attributes of the models? The command for showing heads would then be provided as an option of dp show.

Chengqian-Zhang commented 1 month ago

> @njzjz @iProzd @Chengqian-Zhang Hey guys, how about providing a command like
>
> dp show
>
> to present the attributes of the models? The command for showing heads would then be provided as an option of dp show.

I'm in favor of this idea. What do @iProzd and @njzjz think? The interface could look like this (see the sketch below for one possible wiring):

- `dp show model.pt --list-model-branch` shows the model branches of a pre-trained multi-task model.
- `dp show model.pt --descriptor` shows the parameters of the descriptor module of the pre-trained model.
- `dp show model.pt --type-map` shows the elements covered by the pre-trained model.
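
As a sketch of what the proposed interface might look like at the argparse level (using the positional-attribute form demonstrated in the PR description above), the sub-command could be registered roughly as follows. This is illustrative only; the actual parser lives in `deepmd/main.py` and its option names may differ.

```python
import argparse


def register_show_parser(subparsers: argparse._SubParsersAction) -> None:
    """Register a hypothetical `show` sub-command on an existing `dp` parser."""
    parser = subparsers.add_parser(
        "show", help="show attributes of a (multi-task) pre-trained model"
    )
    parser.add_argument("INPUT", help="checkpoint (.pt) or frozen model (.pth) file")
    parser.add_argument(
        "ATTRIBUTES",
        nargs="+",
        choices=["model-branch", "type-map", "descriptor", "fitting-net"],
        help="attributes of the model to print",
    )


if __name__ == "__main__":
    root = argparse.ArgumentParser(prog="dp")
    sub = root.add_subparsers(dest="command")
    register_show_parser(sub)
    # e.g. `dp show model.pt model-branch type-map`
    print(root.parse_args(["show", "model.pt", "model-branch", "type-map"]))
```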

Chengqian-Zhang commented 1 month ago

I noticed that UT failed, but it doesn't seem to have anything to do with this PR.

wanghan-iapcm commented 1 month ago

Please resolve the conversations opened by coderabbitai.

Chengqian-Zhang commented 1 month ago

I don't understand why the UT in the "merge queue" failed...