jbarnoud opened this issue 4 years ago
@tylerjereddy has been stalwartly working on reducing the warnings.
Do we have a statistic of which ones are the most frequent warnings so that we can make a targeted attempt to address the worst offenders first, e.g., anything from the PDBParser regarding atom types not found? We can then either address the root problems or include warnings filters to mute any warnings during a test that we know are harmless.
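There isn't a built-in statistic, but one rough way to get one (a sketch using made-up stand-in warnings, not the real test-suite output) is to record the warnings and tally them by category and message:

```python
import warnings
from collections import Counter

def tally_warnings(func):
    """Run *func* and return a Counter of (category, message) pairs."""
    with warnings.catch_warnings(record=True) as records:
        warnings.simplefilter("always")  # record every emission, no dedup
        func()
    return Counter((w.category.__name__, str(w.message)) for w in records)

def noisy():
    # Stand-ins for the kinds of warnings the test suite emits.
    for _ in range(3):
        warnings.warn("Guessed all Masses to 1.0", UserWarning)
    warnings.warn("invalid value encountered in sqrt", RuntimeWarning)

# Worst offenders first:
for (category, message), count in tally_warnings(noisy).most_common():
    print(f"{count:6d}  {category}: {message}")
```

Pointing `tally_warnings` at a test run (or parsing the pytest warnings summary) would give the ranked list of offenders to target.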
It would be nice if warnings meant something in tests. One idea is to change the common warnings we emit to dedicated classes (e.g. `MDAnalysis.GuessingWarning`), then have a pytest fixture applied to all tests that silences these warnings. I think this might address ~80% at least.
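A sketch of that idea - note that `GuessingWarning` is hypothetical here, standing in for a dedicated warning class MDAnalysis would have to grow first. Dropped into `conftest.py`, the `autouse` fixture would apply to every test:

```python
import warnings
import pytest

class GuessingWarning(UserWarning):
    """Hypothetical dedicated class for guessing-related warnings."""

@pytest.fixture(autouse=True)
def silence_guessing_warnings():
    # Ignore only GuessingWarning for the duration of each test;
    # everything else still surfaces normally.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", GuessingWarning)
        yield
```

Because the filter is installed inside `catch_warnings`, it is undone after each test, so tests that explicitly assert on guessing warnings can still do so.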
I just had a very quick look and it seems that there are limitations to what types of warning you can ignore globally. From what I understand, there is a fixed list of warning types that can be ignored. For the other types, you would have to capture them on a per-test basis, which would probably be the cleanest but would not be tractable in some cases.
Another option is to ignore individual warnings by string. This would go in a config file, so it is doable.
Finally, it seems that more than half of the warnings are not about guessing, but about invalid values in numpy. By ignoring the `RuntimeWarning`s I fall from ~30,000 to ~13,000 warnings.
Is there anything here that we could let prospective GSoC students do?
Ignoring numpy RuntimeWarnings would be a start.
So, as may have been obvious from all the recent issues, I'm currently looking at the deprecation warnings so we don't lag too far behind for the next release.
I've found that I can reduce the number of emitted warnings to a more manageable 382 with the `setup.cfg` contents shown below.
The first 3 `RuntimeWarning`s (`double_scalars`, `true_divide`, and `sqrt`) account for ~22k warnings being emitted. About ~20k warnings are related to things like `no dt information` and `Guessed all Masses`...
So the question here is, what do we mind being globally filtered?
Personally I think the `UserWarning`s and `DeprecationWarning`s I have down below are probably fine; it's all stuff that's either intentionally triggered or without any negative side effect. The numpy `RuntimeWarning`s, however, seem like they could mask some bad happenings if we aren't careful with them? My personal view would be to have these turned off via decorators at the individual test level - that way we can be sure we're only affecting the tests we know should throw those warnings.
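As a sketch of what that per-test approach could look like with pytest's built-in `filterwarnings` marker (the test body is an illustrative stand-in, not an actual MDAnalysis test):

```python
import numpy as np
import pytest

@pytest.mark.filterwarnings("ignore:invalid value encountered:RuntimeWarning")
def test_normalize_zero_vector():
    # 0/0 deliberately yields nan here; the marker mutes the resulting
    # RuntimeWarning for this test only, leaving the global policy intact.
    v = np.zeros(3)
    result = v / np.linalg.norm(v)
    assert np.isnan(result).all()
```

The marker takes the same `action:message:category` strings as the `filterwarnings` ini option, so entries can be moved from the global config to individual tests verbatim.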
setup.cfg contents:
[tool:pytest]
filterwarnings=
always
error:`np.*` is a deprecated alias for the builtin:DeprecationWarning
# don't enforce for third party packages though:
ignore:`np.*` is a deprecated alias for the builtin:DeprecationWarning:networkx.*:
error:Creating an ndarray from ragged nested sequences
error:invalid escape sequence \\:DeprecationWarning
# Ignore frequent errors popping up in our tests
# Elements
ignore:Element information:UserWarning
ignore:ATOMIC_NUMBER record not found, elements attribute will not be populated:UserWarning
ignore:Unknown element found for some atoms:UserWarning
ignore:Unknown elements found for some atoms:UserWarning
# Mass
ignore:Failed to guess the mass:UserWarning
ignore:Guessed all Masses to 1.0:UserWarning
# Coordinates
ignore:Reader has no dt information, set to 1.0 ps:UserWarning
ignore:No coordinate reader found:UserWarning
ignore:No dimensions set for current frame, zeroed unitcell will be written:UserWarning
ignore:Could not find netCDF4 module:UserWarning
ignore:No coordinates found in the RDKit molecule:UserWarning
ignore:Empty box:UserWarning
ignore:missing dimension - setting unit cell to zeroed box:UserWarning
ignore:Not all velocities were present. Unset velocities set to zero.:UserWarning
## PDB things
ignore:Unit cell dimensions not found. CRYST1 record:UserWarning
ignore:1 A\^3 CRYST1 record:UserWarning
ignore:The bfactor topology attribute is only provided as an alias:DeprecationWarning
# Topology
ignore:Residues specified but no atom_resindex given:UserWarning
ignore:Segments specified but no segment_resindex given:UserWarning
ignore:Found no information for attr:UserWarning
ignore:Supplied AtomGroup was missing the following attributes:UserWarning
ignore:Found missing ChainIDs:UserWarning
ignore:No hydrogen atom found in the topology:UserWarning
ignore:Unknown element X found for some atoms:UserWarning
ignore:Unknown ATOMIC_NUMBER value found for some atoms:UserWarning
# Converters
ignore:NaN detected in coordinates, the output molecule will not have 3D coordinates assigned:UserWarning
ignore:Both _GasteigerCharge and _TriposPartialCharge properties are present:UserWarning
ignore:No `bonds` attribute in this AtomGroup. Guessing bonds based on atoms coordinates:UserWarning
# NamedStream warnings
ignore:Constructed NamedStream:RuntimeWarning
# Selections
ignore:Using float equality to select atoms is not recommended
# Aux
ignore:AuxStep does not support time selection:UserWarning
ignore:AuxStep does not support data selection:UserWarning
# Analysis
ignore:The `hbonds` attribute was deprecated in MDAnalysis:DeprecationWarning
ignore:`HydrogenBondAutoCorrel` is deprecated:DeprecationWarning
ignore:The `p_components` attribute was deprecated in MDAnalysis 2.0.0:DeprecationWarning
ignore:The `rmsf` attribute was deprecated in MDAnalysis 2.0.0:DeprecationWarning
ignore:This module was moved to MDAnalysis:DeprecationWarning
ignore:Box padding \(currently set at 2.0\):UserWarning
ignore:Atom selection does not fit grid:UserWarning
ignore:Desired number of sample points too high, using 6:RuntimeWarning
ignore:The `rmsd` attribute was deprecated in MDAnalysis 2.0.0:DeprecationWarning
ignore:Empty string to select atoms, empty group returned:UserWarning
# RuntimeWarnings
ignore:invalid value encountered in double_scalars:RuntimeWarning
ignore:invalid value encountered in true_divide:RuntimeWarning
ignore:invalid value encountered in sqrt:RuntimeWarning
ignore:invalid value encountered in float_scalars:RuntimeWarning
That's an excellent step forward in making the test output sane(er)!
Filtering pretty much everything that we generate is good. Presumably, our tests can still catch them if we want to test for a specific warning?
Pinpointing the numpy ones to where we generate them and understanding why they are generated is important to make sure that we are actually producing correct results. I like your idea of quieting them on a per-test basis together with a comment explaining why. This will produce a permanent record of our understanding.
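For the pinpointing, one option (a sketch around a hypothetical numeric helper, not MDAnalysis code) is to promote the numpy warnings to exceptions around the suspect arithmetic with `np.errstate`, so the offending call site fails with a full traceback instead of emitting a silent warning:

```python
import numpy as np

def normalize(v):
    # invalid (0/0) and divide (x/0) become FloatingPointError inside
    # this block, so the source of the warning shows up as a test
    # failure with a traceback rather than a line in the summary.
    with np.errstate(invalid="raise", divide="raise"):
        return v / np.linalg.norm(v)

print(normalize(np.array([3.0, 4.0])))  # well-behaved input is unaffected
```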
(And yes, you convinced me to change my opinion https://github.com/MDAnalysis/mdanalysis/issues/2980#issuecomment-798989249 to "ignore NumPy runtime warnings" ;-) ).
I like silencing the errors per test.
Both the expected MDA ones and the external ones?
Ideally yes. That would be the safest. That would also take quite some time, even if I expect few tests and fixtures to generate most of the warnings.
re: "expect few tests and fixtures" -> after multiple hours spent here, that's not really the case imho. You're probably in the order of a few hundred warnings per test at most when you're looking at tens of thousands of warnings overall. I've already spent a few hours on this; it's not a short-term issue.
I will say two things here -
1) We really need to come to an agreement here. I tried starting to do per-test fixtures in PRs and was told that it wasn't worth it; I'd prefer not to have to go back and forth on this. @MDAnalysis/coredevs can we have everyone pitching in towards consensus?
2) This is extremely time consuming - we're talking about thousands of tests which will each require an in-depth look. There are MDA UserWarnings we throw out there that really don't matter (say element stuff, or `dt` being set to 1 ps). Given how many other issues we have to deal with - what are the gains, and are they sufficient to be worth our time?
(Also probably a third point here - we need to be clear on why we're doing this at the test level. If it's to check the validity of each test, then we probably shouldn't make this an issue we throw at prospective GSoC contributors.)
I was only thinking of the numpy runtime warnings
ignore:invalid value encountered in double_scalars:RuntimeWarning
ignore:invalid value encountered in true_divide:RuntimeWarning
ignore:invalid value encountered in sqrt:RuntimeWarning
ignore:invalid value encountered in float_scalars:RuntimeWarning
and seeing if they come from a small number of our tests, because they might hint at issues in algorithms. If feasible, mute these few (maybe not few???) on a per-test basis.
How many of our tests generate these warnings above?
I would be ok with a broad-brush silencing for most everything else.
Those account for ~20k warnings, maybe split across a hundred or so tests?
It's manageable and I'd advocate doing the non-MDA tests manually. It's going to be time consuming either way tbh.
Expected behavior
Running the tests issues no or only a few warnings that can be displayed and, ideally, addressed.
Actual behavior
Running the tests issues more than 30,000 warnings, which are intractable to look at, meaning we can very easily miss important warnings.
Code to reproduce the behavior
Just run the tests
Current version of MDAnalysis
(python -c "import MDAnalysis as mda; print(mda.__version__)") 2.0.0-dev0
Which version of Python (python -V)? Python 3.9.0