zrnsm / pyculiarity

A Python port of Twitter's AnomalyDetection R Package
GNU General Public License v3.0

Possible date conversion error? #1

Open arizhakov opened 9 years ago

arizhakov commented 9 years ago

Hello,

I am trying to run the example that you have posted on the main page, ie:


from pyculiarity import detect_ts
import pandas as pd
twitter_example_data = pd.read_csv('raw_data.csv', usecols=['timestamp', 'count'])
results = detect_ts(twitter_example_data, max_anoms=0.02, direction='both', only_last='day')


but unfortunately, I get an error (please see below):


Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File "C:/Users/user1/Current projects//test/test_pyculiarity_0.py", line 7, in
    direction='both', only_last='day')
  File "build\bdist.win32\egg\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "build\bdist.win32\egg\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')


I have traced the issue as best I could, and the error appears to come from the "largest - second_largest" step, where both operands are strings:


14395    1980-10-05 13:56:00
14396    1980-10-05 13:57:00
14397    1980-10-05 13:58:00
Name: timestamp, Length: 14398, dtype: object
nlargest(2, col): ['1980-10-05 13:58:00', '1980-10-05 13:57:00']
largest: 1980-10-05 13:58:00
second_largest: 1980-10-05 13:57:00
(largest - second_largest):
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File "C:/Users/user1/Current projects//test/test_pyculiarity_0.py", line 17, in
    print "(largest - second_largest): ", (largest - second_largest)
TypeError: unsupported operand type(s) for -: 'str' and 'str'


When I ran the "nosetests ." command, all 13 tests failed. Most hit the same TypeError; the full log is further down the thread.

Any suggestions? Please advise. I appreciate your time in advance. Thank you.

PS: Sorry if my formatting is odd or non-standard. This is my first time posting an issue. I can edit in any clarifications you need.

zrnsm commented 9 years ago

Thanks for taking the time to open an issue! Can you give me more information about your environment? That will help me diagnose the problem. Specifically: the versions of your operating system, Python, pyculiarity, and its dependencies (SciPy, NumPy, R, nose, pip), etc. Also, how did you install the package? The more detail the better.
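
One way to gather most of this information in a single pass is a short script like the sketch below. It is only an illustration: the module list mirrors the dependencies named in this thread, the helper name `collect_versions` is made up for this example, and it relies on each module exposing a `__version__` attribute, which not every package does. The R version has to be checked separately (for example with `R --version`).

```python
import importlib
import platform
import sys

# Module names taken from the dependencies mentioned in this thread.
MODULES = ["numpy", "scipy", "pandas", "pytz", "statsmodels", "rpy2", "nose", "pyculiarity"]


def collect_versions(module_names):
    """Return a dict of environment details and per-module version strings."""
    report = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "not installed"
            continue
        except Exception as exc:  # e.g. a module that fails while loading a native library
            report[name] = "import failed (%s)" % exc
            continue
        # Not every package defines __version__, so fall back to a placeholder.
        report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    for key, value in sorted(collect_versions(MODULES).items()):
        print("%s: %s" % (key, value))
```

Pasting the printed lines into a comment gives the maintainer the whole environment at once.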

arizhakov commented 9 years ago

Sure thing, would love to help on this.

I started the install by downloading the .zip file (right side of https://github.com/nicolasmiller/pyculiarity), and unzipped.

To answer your questions:
OS: Win 7 enterprise
Python: 2.7.6
pyculiarity: 0.0.1a3
SciPy: 0.14.0
NumPy: 1.8.1
R: 3.2.1 (2015-06-18)
nose: 1.3.3
pip: N/A. firewall issues at work, so have to install all my packages manually :(

Please let me know if you need more info. I have a log.txt that shows all my steps, though I do not know how to attach it to this comment. If you would like to see it, please let me know how to send it to you or post it here.

I am going to play around with etsy/skyline for my current project, but would love to come back to pyculiarity to supplement.

Thanks!

zrnsm commented 9 years ago

Ah, I should have noticed Win32 in the stacktrace. The package has been tested exclusively in *nix environments, both OS X and Linux (this should be more explicit in the documentation). I'll need to scrounge up a Windows box to test on. I should be able to take a look later this evening.

arizhakov commented 9 years ago

Thanks.

As a theoretical question, why would the OS environment matter for TypeErrors? To me, this sounds like an environment-independent issue dealing with a math operation on strings, rather than ints/floats, no?

zrnsm commented 9 years ago

I would tend to agree with you just looking at the stacktrace. But I know the example works just fine in other environments. There might be some slightly different behavior happening in your versions of the dependencies for example. I want to see if I can reproduce your exact problem and then work backwards from there.

Oh, can you give me your versions of pandas, pytz, statsmodels and rpy2 as well?

zrnsm commented 9 years ago

But, yes, it looks like we do end up with string dates where the code is expecting actual date objects that can be subtracted. So for some reason the normal conversion that should happen before we reach that point is failing. I'll keep you posted with what I find once I have a machine to debug with.

In the meantime, it might be worth tracing through what's happening in your environment around the format_timestamp function to understand why exactly the conversion from a date string isn't happening.
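
The failure mode described above can be reproduced and worked around in isolation. The sketch below assumes a DataFrame with a `timestamp` column holding strings like those in the reported output; the `count` values are made up, and `pd.to_datetime` is used purely for illustration, not as a description of what `format_timestamp` does internally.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["1980-10-05 13:56:00", "1980-10-05 13:57:00", "1980-10-05 13:58:00"],
    "count": [10, 12, 11],  # made-up values for the example
})

# Before conversion the column holds plain strings, which cannot be subtracted.
was_datetime = pd.api.types.is_datetime64_any_dtype(df["timestamp"])

# Parse the strings into real timestamps.
parsed = pd.to_datetime(df["timestamp"])

# Take the two most recent timestamps and compute the gap between them in seconds.
top_two = parsed.sort_values(ascending=False).iloc[:2]
largest, second_largest = top_two.iloc[0], top_two.iloc[1]
gran_seconds = int(round((largest - second_largest).total_seconds()))

print(was_datetime)   # False
print(gran_seconds)   # 60
```

If the equivalent check on the real data shows the column is still not a datetime type by the time `get_gran` runs, the conversion step is the place to look.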

arizhakov commented 9 years ago

pandas: 0.13.1
pytz: 2014.3
statsmodels: 0.5.0
rpy2: 2.6.1

arizhakov commented 9 years ago

Perhaps this is overkill, but here is the log. Feel free to delete this comment if it is excessive :)


Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Python27\Lib\site-packages\pyculiarity-master>python install setup.py
python: can't open file 'install': [Errno 2] No such file or directory

C:\Python27\Lib\site-packages\pyculiarity-master>python setup.py install
running install
running bdist_egg
running egg_info
creating pyculiarity.egg-info
writing requirements to pyculiarity.egg-info\requires.txt
writing pyculiarity.egg-info\PKG-INFO
writing top-level names to pyculiarity.egg-info\top_level.txt
writing dependency_links to pyculiarity.egg-info\dependency_links.txt
writing manifest file 'pyculiarity.egg-info\SOURCES.txt'
reading manifest file 'pyculiarity.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*.pyc' found under directory 'tests'
writing manifest file 'pyculiarity.egg-info\SOURCES.txt'
installing library code to build\bdist.win32\egg
running install_lib
running build_py
creating build
creating build\lib
creating build\lib\pyculiarity
copying pyculiarity\date_utils.py -> build\lib\pyculiarity
copying pyculiarity\detect_anoms.py -> build\lib\pyculiarity
copying pyculiarity\detect_ts.py -> build\lib\pyculiarity
copying pyculiarity\detect_vec.py -> build\lib\pyculiarity
copying pyculiarity\r_stl.py -> build\lib\pyculiarity
copying pyculiarity\__init__.py -> build\lib\pyculiarity
creating build\bdist.win32
creating build\bdist.win32\egg
creating build\bdist.win32\egg\pyculiarity
copying build\lib\pyculiarity\date_utils.py -> build\bdist.win32\egg\pyculiarity
copying build\lib\pyculiarity\detect_anoms.py -> build\bdist.win32\egg\pyculiarity
copying build\lib\pyculiarity\detect_ts.py -> build\bdist.win32\egg\pyculiarity
copying build\lib\pyculiarity\detect_vec.py -> build\bdist.win32\egg\pyculiarity
copying build\lib\pyculiarity\r_stl.py -> build\bdist.win32\egg\pyculiarity
copying build\lib\pyculiarity\__init__.py -> build\bdist.win32\egg\pyculiarity
byte-compiling build\bdist.win32\egg\pyculiarity\date_utils.py to date_utils.pyc
byte-compiling build\bdist.win32\egg\pyculiarity\detect_anoms.py to detect_anoms.pyc
byte-compiling build\bdist.win32\egg\pyculiarity\detect_ts.py to detect_ts.pyc
byte-compiling build\bdist.win32\egg\pyculiarity\detect_vec.py to detect_vec.pyc
byte-compiling build\bdist.win32\egg\pyculiarity\r_stl.py to r_stl.pyc
byte-compiling build\bdist.win32\egg\pyculiarity\__init__.py to __init__.pyc
creating build\bdist.win32\egg\EGG-INFO
copying pyculiarity.egg-info\PKG-INFO -> build\bdist.win32\egg\EGG-INFO
copying pyculiarity.egg-info\SOURCES.txt -> build\bdist.win32\egg\EGG-INFO
copying pyculiarity.egg-info\dependency_links.txt -> build\bdist.win32\egg\EGG-INFO
copying pyculiarity.egg-info\requires.txt -> build\bdist.win32\egg\EGG-INFO
copying pyculiarity.egg-info\top_level.txt -> build\bdist.win32\egg\EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist\pyculiarity-0.0.1a3-py2.7.egg' and adding 'build\bdist.win32\egg' to it
removing 'build\bdist.win32\egg' (and everything under it)
Processing pyculiarity-0.0.1a3-py2.7.egg
Copying pyculiarity-0.0.1a3-py2.7.egg to c:\python27\lib\site-packages
Adding pyculiarity 0.0.1a3 to easy-install.pth file

Installed c:\python27\lib\site-packages\pyculiarity-0.0.1a3-py2.7.egg
Processing dependencies for pyculiarity==0.0.1a3
Searching for rpy2==2.6.1
Best match: rpy2 2.6.1
Adding rpy2 2.6.1 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for statsmodels==0.5.0
Best match: statsmodels 0.5.0
Adding statsmodels 0.5.0 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for pytz==2014.3
Best match: pytz 2014.3
Adding pytz 2014.3 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for pandas==0.13.1
Best match: pandas 0.13.1
Adding pandas 0.13.1 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for scipy==0.14.0
Best match: scipy 0.14.0
Adding scipy 0.14.0 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for numpy==1.8.1
Best match: numpy 1.8.1
Adding numpy 1.8.1 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for six==1.6.1
Best match: six 1.6.1
Adding six 1.6.1 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for singledispatch==3.4.0.3
Best match: singledispatch 3.4.0.3
Adding singledispatch 3.4.0.3 to easy-install.pth file

Using c:\python27\lib\site-packages
Searching for python-dateutil==2.2
Best match: python-dateutil 2.2
Adding python-dateutil 2.2 to easy-install.pth file

Using c:\python27\lib\site-packages
Finished processing dependencies for pyculiarity==0.0.1a3

C:\Python27\Lib\site-packages\pyculiarity-master>nosetests .

EEEEEEEEEEEEE

ERROR: test_check_constant_series (test_edge.TestEdge)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_edge.py", line 15, in test_check_constant_series
    results = detect_vec(s, period=14, direction='both', plot=False)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_vec.py", line 188, in detect_vec
    verbose=verbose)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_anoms.py", line 70, in detect_anoms
    decomp = stl(data.value, "periodic", np=num_obs_per_period)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\r_stl.py", line 42, in stl
    ts = ts(robjects.FloatVector(asarray(data)), start=start, frequency=np)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 105, in __call__
    new_kwargs[k] = conversion.py2ri(v)
  File "C:\Python27\lib\site-packages\singledispatch.py", line 210, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "C:\Python27\lib\site-packages\rpy2\robjects\conversion.py", line 39, in py2ri
    raise NotImplementedError("Conversion 'py2ri' not defined for objects of type '%s'" % str(type(obj)))
NotImplementedError: Conversion 'py2ri' not defined for objects of type '<type 'numpy.int64'>'

ERROR: test_check_midnight_date_format (test_edge.TestEdge)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_edge.py", line 28, in test_check_midnight_date_format
    e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_handling_of_leading_trailing_nas (test_na.TestNAs)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_na.py", line 21, in test_handling_of_leading_trailing_nas
    direction='both', plot=False)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_handling_of_middle_nas (test_na.TestNAs)


Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\nose\tools\nontrivial.py", line 60, in newfunc
    func(*arg, **kw)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_na.py", line 28, in test_handling_of_middle_nas
    detect_ts(self.raw_data, max_anoms=0.02, direction='both')
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_e_value_longterm (test_ts.TestTS)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_ts.py", line 24, in test_both_directions_e_value_longterm
    plot=False, e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_e_value_threshold_med_max (test_ts.TestTS)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_ts.py", line 32, in test_both_directions_e_value_threshold_med_max
    e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_with_plot (test_ts.TestTS)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_ts.py", line 17, in test_both_directions_with_plot
    plot=False)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_e_value_longterm (test_ts_AR.TestTS)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_ts_AR.py", line 24, in test_both_directions_e_value_longterm
    plot=False, e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_e_value_threshold_med_max (test_ts_AR.TestTS)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_ts_AR.py", line 32, in test_both_directions_e_value_threshold_med_max
    e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_with_plot (test_ts_AR.TestTS)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_ts_AR.py", line 17, in test_both_directions_with_plot
    plot=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_ts.py", line 142, in detect_ts
    gran = get_gran(df)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\date_utils.py", line 46, in get_gran
    gran = int(round((largest - second_largest) / np.timedelta64(1, 's')))
TypeError: ufunc divide cannot use operands with types dtype('O') and dtype('<m8[s]')

ERROR: test_both_directions_e_value_longterm (test_vec.TestVec)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_vec.py", line 24, in test_both_directions_e_value_longterm
    longterm_period=1440*14, e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_vec.py", line 188, in detect_vec
    verbose=verbose)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_anoms.py", line 70, in detect_anoms
    decomp = stl(data.value, "periodic", np=num_obs_per_period)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\r_stl.py", line 42, in stl
    ts = ts(robjects.FloatVector(asarray(data)), start=start, frequency=np)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 105, in __call__
    new_kwargs[k] = conversion.py2ri(v)
  File "C:\Python27\lib\site-packages\singledispatch.py", line 210, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "C:\Python27\lib\site-packages\rpy2\robjects\conversion.py", line 39, in py2ri
    raise NotImplementedError("Conversion 'py2ri' not defined for objects of type '%s'" % str(type(obj)))
NotImplementedError: Conversion 'py2ri' not defined for objects of type '<type 'numpy.int64'>'

ERROR: test_both_directions_e_value_threshold_med_max (test_vec.TestVec)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_vec.py", line 31, in test_both_directions_e_value_threshold_med_max
    threshold="med_max", e_value=True)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_vec.py", line 188, in detect_vec
    verbose=verbose)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_anoms.py", line 70, in detect_anoms
    decomp = stl(data.value, "periodic", np=num_obs_per_period)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\r_stl.py", line 42, in stl
    ts = ts(robjects.FloatVector(asarray(data)), start=start, frequency=np)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 105, in __call__
    new_kwargs[k] = conversion.py2ri(v)
  File "C:\Python27\lib\site-packages\singledispatch.py", line 210, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "C:\Python27\lib\site-packages\rpy2\robjects\conversion.py", line 39, in py2ri
    raise NotImplementedError("Conversion 'py2ri' not defined for objects of type '%s'" % str(type(obj)))
NotImplementedError: Conversion 'py2ri' not defined for objects of type '<type 'numpy.int64'>'

ERROR: test_both_directions_with_plot (test_vec.TestVec)


Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pyculiarity-master\tests\test_vec.py", line 17, in test_both_directions_with_plot
    only_last=True, plot=False)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_vec.py", line 188, in detect_vec
    verbose=verbose)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\detect_anoms.py", line 70, in detect_anoms
    decomp = stl(data.value, "periodic", np=num_obs_per_period)
  File "C:\Python27\Lib\site-packages\pyculiarity-master\pyculiarity\r_stl.py", line 42, in stl
    ts = ts(robjects.FloatVector(asarray(data)), start=start, frequency=np)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 105, in __call__
    new_kwargs[k] = conversion.py2ri(v)
  File "C:\Python27\lib\site-packages\singledispatch.py", line 210, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "C:\Python27\lib\site-packages\rpy2\robjects\conversion.py", line 39, in py2ri
    raise NotImplementedError("Conversion 'py2ri' not defined for objects of type '%s'" % str(type(obj)))
NotImplementedError: Conversion 'py2ri' not defined for objects of type '<type 'numpy.int64'>'


Ran 13 tests in 6.751s

FAILED (errors=13)

C:\Python27\Lib\site-packages\pyculiarity-master>

zrnsm commented 9 years ago

Excellent! This really helps.

zrnsm commented 9 years ago

I managed to get a Windows environment up on my Mac under Parallels and have reproduced the error. Working on a fix now.

Another thing occurred to me: are you on 32-bit or 64-bit Windows? And are you using 32 or 64 bit versions of the python dependencies?

arizhakov commented 9 years ago

Glad you could reproduce and that it wasn't some random issue with my configuration. Progress!

I am on 64-bit Windows.

I am not sure about 32 vs 64 bit Python dependencies; how can I check? I am running my scripts out of Python(x.y), FYI.
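
A quick way to answer the 32-bit versus 64-bit question is to ask the interpreter itself, since compiled extension modules such as NumPy have to match the bitness of the Python build that loads them. The snippet below is a minimal sketch using only the standard library; the function name `python_bitness` is made up for this example.

```python
import platform
import struct
import sys


def python_bitness():
    """Return 32 or 64 based on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


print("Python build: %d-bit" % python_bitness())
# A second, independent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
print("sys.maxsize > 2**32: %s" % (sys.maxsize > 2 ** 32))
# The machine type reported by the operating system, for comparison.
print("machine: %s" % platform.machine())
```

A 32-bit Python can run on 64-bit Windows, so the first line is the one that matters for the dependencies.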

arizhakov commented 9 years ago

I will also dig into the issue today on my end and see what I can find. Please let me know how I can assist. Looking forward to getting this up and running.

zrnsm commented 9 years ago

I managed to get it working on Windows with all of your versions of the dependencies save one: statsmodels. I needed to upgrade to 0.6.1 from 0.5.0. Between those versions, there were changes made to statsmodels.robust.scale.mad (median absolute deviation), which we use in detect_anoms. This was throwing results off slightly. I also had to make a couple of other minor changes to accommodate differences in some of the numpy and pandas types. Take a look at the commit history if you're interested.
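
As an illustration of the kind of type difference involved: the `NotImplementedError` in the log above shows rpy2 2.6.1 refusing a `numpy.int64` keyword argument. A common workaround for that class of problem is to convert NumPy scalars to built-in Python types before passing them to a library that has no converter for them. The helper below, `to_native`, is a hypothetical name used only for this sketch and is not taken from the actual commits.

```python
import numpy as np


def to_native(value):
    """Convert a NumPy scalar to the matching built-in Python type; pass anything else through."""
    if isinstance(value, np.generic):
        return value.item()
    return value


period = np.int64(14)
print(type(period).__name__)             # int64
print(type(to_native(period)).__name__)  # int
```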

Long story short: grab the latest code from master and upgrade to at least statsmodels 0.6.1 and give it a shot. If you're stuck with 0.5.0 for some reason, you'll need to replace the mad function in detect_anoms with something that is consistent with what 0.6.1's mad returns.
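
For anyone pinned to 0.5.0, a stand-in might look like the sketch below. It assumes the behavior to match is the textbook median absolute deviation: centered on the median and divided by the 0.75 quantile of the standard normal distribution (about 0.6745). Whether that reproduces the output of 0.6.1's mad exactly is an assumption that should be checked against statsmodels 0.6.1 itself. The name `centered_mad` is made up for this example.

```python
import numpy as np

# 0.75 quantile of the standard normal distribution, used as the consistency constant.
NORMAL_Q75 = 0.6744897501960817


def centered_mad(values):
    """Median absolute deviation around the median, scaled for normal data."""
    data = np.asarray(values, dtype=float)
    center = np.median(data)
    return np.median(np.abs(data - center)) / NORMAL_Q75


# The single outlier (100) barely moves the estimate, which is the point of using MAD.
print(centered_mad([1, 2, 3, 4, 100]))
```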

arizhakov commented 9 years ago

Excellent!! Nice work! I will give this a shot today, and reply with my results.

arizhakov commented 9 years ago

@nicolasmiller, I am still having issues :(. As instructed, I grabbed the latest statsmodels and pyculiarity code and installed both, but I am getting a similar (maybe the same) error as before. When I check versions, I get:


import statsmodels
print statsmodels.__version__
0.6.1
import pyculiarity
print pyculiarity.__version__
0.0.2


I have tried a simple example, but it gives me an error:


print "(largest - second_largest): ", (np.timedelta64(largest) - np.timedelta64(second_largest))
ValueError: Could not convert object to NumPy timedelta


however,


largest1
'1980-10-05'
second_largest2
'1980-10-01'

np.timedelta64(largest1-second_largest2)
Traceback (most recent call last):
  File "", line 1, in
TypeError: unsupported operand type(s) for -: 'str' and 'str'
np.datetime64(largest1) - np.datetime64(second_largest2)
numpy.timedelta64(4,'D')


I am not sure what this indicates. I think that it has something to do with the string including the date and the hour/min/sec all in one string, which perhaps numpy doesn't like.

I will think more tomorrow, but just wanted to give you a heads up. Any suggestions about what I am doing wrong are appreciated.
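
A small sketch may help separate the two NumPy types in the experiment above. `np.datetime64` represents a point in time and can parse a date string, while `np.timedelta64` represents a duration, so a date string is not a valid input for it. NumPy's parser also accepts a date and time separated by a space, which suggests the combined string is not the problem in itself. The variable names below are illustrative only.

```python
import numpy as np

largest = "1980-10-05 13:58:00"
second_largest = "1980-10-05 13:57:00"

# Parse each string as a point in time, then subtract to get a duration.
delta = np.datetime64(largest) - np.datetime64(second_largest)

# Express the duration as a whole number of seconds.
gran_seconds = int(round(delta / np.timedelta64(1, "s")))
print(gran_seconds)  # 60

# A date string is not a duration, so timedelta64 rejects it.
try:
    np.timedelta64(largest)
except (ValueError, TypeError) as exc:
    print("rejected:", exc)
```

So the subtraction only works once both values have been converted to datetimes, which is the conversion that appears to be skipped in the failing environment.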

zrnsm commented 9 years ago

Sorry to hear that. I'll take another look. There must still be something different about our environments and dependencies, etc. Can you double check all of the version numbers and confirm that they're still the same as what you listed before?

arizhakov commented 9 years ago

Oh no, sorry about the late reply. I am just seeing this now. I should have some time tomorrow to send you the info that you requested above, as well as any diagnostics that I can come up with. Additionally, I will test on my home Windows machine (Windows 10) to see if this is just some strange occurrence on my production machine. Thanks for your efforts!