Closed njzjz closed 2 weeks ago
The main changes introduce a `seed` parameter across numerous classes and functions in the `deepmd` module, supporting both single integers and lists of integers. These adjustments ensure the seed is passed down uniformly to the network layers, improving reproducibility and control over randomness.
Files/Modules | Changes Summary
---|---
`deepmd/dpmodel/descriptor/dpa1.py` | Added `seed` parameter to constructors.
`deepmd/dpmodel/descriptor/dpa2.py` | Updated `init_subclass_params` to include the `seed` parameter.
`deepmd/dpmodel/descriptor/repformers.py` | Extended the `DescrptBlockRepformers` class with several new parameters, including `seed`.
`deepmd/dpmodel/descriptor/se_atten_v2.py` | Modified `seed` in the `DescrptDPA1` class constructor to accept lists.
`deepmd/dpmodel/descriptor/se_e2_a.py` | Modified the calculation and usage of `seed` in the `__init__` method.
`deepmd/dpmodel/descriptor/se_r.py` | Added `seed` parameter with its calculation in the constructor.
`deepmd/dpmodel/descriptor/se_t.py` | Enhanced the seeding logic with the `child_seed` function.
`deepmd/dpmodel/fitting/dipole_fitting.py` | Updated `seed` to accept both integer and list types.
`deepmd/dpmodel/fitting/dos_fitting.py` | Updated `seed` to accept both integer and list types.
`deepmd/dpmodel/fitting/ener_fitting.py` | Updated `seed` to accept both integer and list types.
`deepmd/dpmodel/fitting/general_fitting.py` | Added `seed` parameter to the `GeneralFitting` class.
`deepmd/dpmodel/fitting/polarizability_fitting.py` | Updated `seed` to accept both integer and list types.
`deepmd/dpmodel/utils/network.py` | Introduced optional `precision` and `seed` parameters in various class constructors.
`deepmd/dpmodel/utils/seed.py` | Added the `child_seed` function for generating child seeds.
`deepmd/dpmodel/utils/type_embed.py` | Added `seed` parameter to the `__init__` method of the affected class.
`deepmd/pt/model/descriptor/dpa1.py` | Enhanced the seed logic in the class constructor to support integers and lists.
`deepmd/pt/model/descriptor/dpa2.py` | Modified `init_subclass_params` to handle `seed` with additional computations and the `child_seed` function.
`deepmd/pt/model/descriptor/repformer_layer.py` | Modified seed values in the `RepformerLayer` class to control random-seed behavior effectively.
`deepmd/pt/model/descriptor/se_atten.py` | Enhanced the seeding logic with customized calculations for `seed` based on existing attributes.
`deepmd/pt/model/network/network.py` | Adjusted the `__init__` method to modify the `seed` parameter behavior.
`deepmd/pt/utils/utils.py` | Enhanced `get_generator` to accept lists as seeds and hash them into a torch generator seed.
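The `child_seed` helper referenced above is the core of the scheme: it derives a deterministic sub-seed for each child module from the parent seed. The following is a hypothetical sketch only, not the actual implementation in `deepmd/dpmodel/utils/seed.py`, which may mix entropy differently; it illustrates the idea of appending a child index to the parent seed so each layer gets an independent, reproducible seed:

```python
from typing import Optional, Union

def child_seed(
    seed: Optional[Union[int, list]], idx: int
) -> Optional[list]:
    """Derive a deterministic sub-seed for the idx-th child module.

    Appending the child index to the parent seed (as a list) gives each
    child its own entropy stream, in the spirit of
    numpy.random.SeedSequence spawning. Illustrative sketch only.
    """
    if seed is None:
        # No seeding requested: children stay unseeded too.
        return None
    if isinstance(seed, int):
        return [seed, idx]
    return [*seed, idx]
```

With this shape, a descriptor seeded with `[s]` can pass `child_seed([s], i)` to its i-th network layer without the parent ever tracking how many seeds each child consumes internally.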
Objective (Issue #3799) | Addressed | Explanation
---|---|---
Ensure `seed` parameters are passed to network layers | ✅ |
Enhance `seed` parameter to accept lists of integers | ✅ |
Modify seeding logic using the `child_seed` function | ✅ |
Attention: Patch coverage is 97.89474% with 2 lines in your changes missing coverage. Please review. Project coverage is 82.74%. Comparing base (c644314) to head (8aacdea). Report is 3 commits behind head on devel.
Files | Patch % | Lines
---|---|---
`deepmd/dpmodel/descriptor/repformers.py` | 83.33% | 1 Missing :warning:
`deepmd/dpmodel/utils/seed.py` | 92.30% | 1 Missing :warning:
I find one has to spend much effort maintaining the number of seeds consumed by each module. I would recommend the parallel random number generation feature, with which we do not need to manually record the number of consumed seeds. See e.g. https://numpy.org/doc/stable/reference/random/parallel.html

It's unclear how to share arguments with PT when the input is a `numpy.random.SeedSequence`, considering PT also uses several functions in dpmodel.
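For reference, the NumPy parallel-RNG facility mentioned above lets a parent spawn independent child streams without any manual bookkeeping of how many seeds each module consumes:

```python
import numpy as np

# A SeedSequence can spawn independent children; each child carries its
# own entropy, so modules need not count the seeds they consume.
ss = np.random.SeedSequence(42)
child_a, child_b = ss.spawn(2)
rng_a = np.random.default_rng(child_a)
rng_b = np.random.default_rng(child_b)

# Spawning is deterministic: rebuilding the same SeedSequence and
# spawning again reproduces the same child streams.
ss_again = np.random.SeedSequence(42)
child_a_again, _ = ss_again.spawn(2)
rng_a_again = np.random.default_rng(child_a_again)
```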
I just realized one can do this:

```python
>>> import numpy as np
>>> rng = np.random.default_rng([1, 2, 3, 4, 5])
```

So it will work if we change the type of `seed` from `int | None` to `int | list[int] | None`.
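This works because `default_rng` treats a list of integers as entropy for a `SeedSequence`, so the same list always reproduces the same stream:

```python
import numpy as np

# Lists of integers are valid seeds for default_rng; identical lists
# yield identical generators, so list-valued seeds stay reproducible.
rng1 = np.random.default_rng([1, 2, 3, 4, 5])
rng2 = np.random.default_rng([1, 2, 3, 4, 5])
assert rng1.integers(0, 2**32) == rng2.integers(0, 2**32)
```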
Could you please add a UT for the mixed entropy to make sure the implementation is consistent with NumPy?
No, it's not consistent, as PyTorch limits the seed to (-0xffff_ffff_ffff_ffff, 0xffff_ffff_ffff_ffff). It is only ensured that the seed generated by `mix_entropy` has high entropy, using an approach similar to NumPy's.
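One way to fold a list of seeds into a single value that fits PyTorch's accepted range is to reuse NumPy's own entropy-mixing machinery and truncate to 64 bits. This is an illustrative sketch only; the actual `mix_entropy` in `deepmd/pt/utils/utils.py` may use different mixing constants:

```python
import numpy as np

def mix_entropy(seed) -> int:
    """Fold an int or a list of ints into one 64-bit integer suitable
    for seeding a torch.Generator.

    Illustrative sketch: delegates the mixing to numpy's SeedSequence,
    whose output words are uint64 and hence already lie within the
    range PyTorch accepts for a manual seed.
    """
    entropy = [seed] if isinstance(seed, int) else list(seed)
    mixed = np.random.SeedSequence(entropy).generate_state(1, dtype=np.uint64)[0]
    return int(mixed)
```

The result is deterministic for a given input list and has high entropy even for adjacent inputs, which is the property the comment above refers to.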
Fix #3799.