stephenslab / dsc

Repo for Dynamic Statistical Comparisons project
https://stephenslab.github.io/dsc-wiki
MIT License
12 stars 12 forks source link

Suppress numpy warnings from conda version of dsc-query #159

Closed jdblischak closed 4 years ago

jdblischak commented 5 years ago

To reduce the number of conda binaries per Python package, the recipes are built using an older version of numpy that is forwards compatible (https://github.com/ContinuumIO/anaconda-issues/issues/6678#issuecomment-337291250 & https://github.com/ContinuumIO/anaconda-issues/issues/6678#issuecomment-337302945). The downside is that this creates an unnecessary warning. It can be safely ignored (https://github.com/ContinuumIO/anaconda-issues/issues/6678#issuecomment-337276215), but still may worry end users.

(test-dscr) $ dsc --version
0.3.1.2
(test-dscr) $ dsc-query --version
/home/jdb-work/miniconda3-conda-forge/envs/test-dscr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
0.3.1.2

It should be possible to suppress this warning using filterwarnings, e.g. as was done in https://github.com/deeptools/HiCExplorer/commit/eec659aa6701b97bd60e363d533174a47d32a8e1.

gaow commented 5 years ago

Thanks @jdblischak are you suggesting simply:

diff --git a/src/query_engine.py b/src/query_engine.py
index 230d0f8..beb538a 100644
--- a/src/query_engine.py
+++ b/src/query_engine.py
@@ -5,6 +5,9 @@ __email__ = "gaow@uchicago.edu"
 __license__ = "MIT"
 import os, re, pickle
 import pandas as pd, numpy as np
+import warnings
+warnings.filterwarnings("ignore", message="numpy.dtype size changed")
+warnings.filterwarnings("ignore", message="numpy.ndarray size changed")
 from .utils import uniq_list, case_insensitive_uniq_list, flatten_list, filter_sublist, FormatError, DBError, logger
 from .yhat_sqldf import sqldf
 from .line import parse_filter

? this is to match the exact warning message.

Alternatively, I'm thinking it should be less adhoc if we have a minimum version requirement for DSC's numpy on the recipe such that it does not generate the warning to begin with. What would be the drawback of doing that?

jdblischak commented 5 years ago

are you suggesting simply:

Basically yes. I think the code to filter the warnings might need to be inserted prior to importing numpy, but I don't know enough about how the error is generated to know without testing it out first.

Alternatively, I'm thinking it should be less adhoc if we have a minimum version requirement for DSC's numpy on the recipe such that it does not generate the warning to begin with. What would be the drawback of doing that?

The drawback would be that it wouldn't be able to be built as easily for conda. Remember, this is just a warning. Nothing is wrong with using an older version of numpy. I figured that if adding a few extra lines of code eliminates an unnecessary warning, then it'd be worth doing.

jdblischak commented 5 years ago

Commit ccc7186e46786903a235f24469f5cc5c7e095180 didn't remove the warning. The conda package for DSC 0.3.2 has the same behavior:

$ conda install dsc==0.3.2
$ dsc-query --version
/home/jdb-work/miniconda3-conda-forge/envs/dsc/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/jdb-work/miniconda3-conda-forge/envs/dsc/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
0.3.2
gaow commented 5 years ago

Too bad. Apparently these messages are not enough to filter it. I've removed those lines in a patch I sent in just now. I'll have to be able to reproduce the behavior first (in some conda environment) then try to fix it, if we are not restricting version requirements for numpy. Let's keep this ticket still open for now.

gaow commented 5 years ago

In fact, the current default version of numpy on travis is triggering errors that fails the unit tests:

https://travis-ci.org/stephenslab/dsc/builds/479218621#L1299

We might want to look into that issue with this ticket collectively ...

jdblischak commented 5 years ago

In fact, the current default version of numpy on travis is triggering errors that fails the unit tests:

pip installed the latest version from PyPI, which is currently 1.16.0.

https://travis-ci.org/stephenslab/dsc/builds/479218621#L905 https://pypi.org/project/numpy/

I don't understand why that is causing an issue though. I just ran the tests locally with numpy 1.16.0 and they passed.

gaow commented 5 years ago

thanks @jdblischak for testing it out on your local machine. Yes looks like this needs some digging into ... i do not have a clue for now.

gaow commented 5 years ago

I tried to give another stab at this but to no veil. I was not able to reproduce the issue using a freshly installed conda environment. This is what I get:

[GW] python test_query.py 
..
----------------------------------------------------------------------
Ran 2 tests in 0.376s
OK
[GW] python test_parser.py 
.................
----------------------------------------------------------------------
Ran 17 tests in 4.544s
OK
[GW] pip list numpy | grep numpy
numpy                             1.16.0    
[GW] dsc-query --version
0.3.2

So the first step is still to reproduce it ... likely using Ubuntu stock pip (not from conda)? Then fix it? It'd be a bit silly to work hard at the problem if the problem is numpy distribution issue. Another way to make travis work is to instead set up a conda to the Ubuntu VM it uses. That should certainly solve the problem given what I've learned on my machine, but also feels a bit awkward.

Notice the release of numpy 1.16.0 was 19 hours ago about the same time our travis failed. There is this post from 16 hours ago:

https://teratail.com/questions/168704

Their solution was basically to reinstall everything (not clear to me via pip or conda although the test environemnt was conda). Maybe this is a new artifact and we should just wait it out for a brown-paper-bag patch on numpy?

BTW in here https://pypi.org/project/numpy apparently the new release provides "a powerful N-dimensional array object" -- indeed!

jdblischak commented 5 years ago

Another way to make travis work is to instead set up a conda to the Ubuntu VM it uses. That should certainly solve the problem given what I've learned on my machine, but also feels a bit awkward.

An easier solution for now would be to pin numpy to 0.15.0 in requirements.txt.

Maybe this is a new artifact and we should just wait it out for a brown-paper-bag patch on numpy?

Good point. The timing suspicious. I agree we shouldn't do anything too drastic on our end until the new numpy release has been tested in the wild longer.

gaow commented 5 years ago

Good pointer! Travis now passes again after I pin numpy to version 1.15.4. It should not harm since we do not document to the public to install via requirement.txt and whoever knows what it is for should be sophisticated enough to fix any issues on their own. I guess we'd have to have some minimal version requirement for numpy, after this issue was fixed!

BTW if you look at the travis build log there is no warning on dsc-query any more.

jdblischak commented 5 years ago

BTW if you look at the travis build log there is no warning on dsc-query any more.

Did you mean for this repo or the dsc-feedstock Travis log? The dsc-query warnings come from the conda package, so I don't expect them from the source builds for this repo.

However, I'm having trouble replicating the previous warning message. I tried re-creating the same environment I had for the original post, which was an environment that had my conda package of dscrutils.

conda create -y -n test-dscr -c jdblischak r-dscrutils

I have updated conda since then, but nothing in the changelog suggests this behavior would be affected. Also, I tested in a Docker container that I had previously created and didn't get the warning (which would have had an older version of conda).

@gaow Have you ever seen the numpy warning with the conda package version of dsc-query?

gaow commented 5 years ago

@jdblischak thanks for checking it! No I do not have an issue with my conda version but possibly this is because my numpy is up to date. I never tried to create a fresh conda environment to test it if that is what you are asking. So I'm as confused as you are why it is no longer raising it.

Did you mean for this repo or the dsc-feedstock Travis log?

I mean to say for the travis log:

https://travis-ci.org/stephenslab/dsc/jobs/479643609#L901

which still does not have the issue.

My money is still on numpy versions; earlier I was inclined to have some minimum numpy version requirement but now I'm not so sure given the issue we've seen with numpy 0.16.0 ...