Open siyer32 opened 6 years ago
Here is the Python 3.6.5 output:
appdata['sa'] = cypd.to_ipaddress(appdata['sa'])
appdata.dtypes
sa ip
da object
sp int64
dp int64
ipkt int64
ibyt int64
Application Label object
We either need to make a better error message here, or break with Python 2's ipaddress module.
In Python 2, it expects a unicode object when parsing a string IP address like '192.168.1.1'. See https://cyberpandas.readthedocs.io/en/latest/usage.html#parsing
In [13]: import pandas as pd
In [14]: from cyberpandas import to_ipaddress
In [15]: df = pd.DataFrame({"addr": ['192.168.1.1', '192.168.1.2']})
In [16]: to_ipaddress(df.addr)
---------------------------------------------------------------------------
AddressValueError Traceback (most recent call last)
<ipython-input-16-1f1c4ac488eb> in <module>()
----> 1 to_ipaddress(df.addr)
/Users/taugspurger/sandbox/cyberpandas/cyberpandas/parser.py in to_ipaddress(values)
40 values = [values]
41
---> 42 return IPArray(_to_ip_array(values))
43
44
/Users/taugspurger/sandbox/cyberpandas/cyberpandas/parser.py in _to_ip_array(values)
59 elif not (isinstance(values, np.ndarray) and
60 values.dtype == IPType._record_type):
---> 61 values = _to_int_pairs(values)
62 return np.atleast_1d(np.asarray(values, dtype=IPType._record_type))
63
/Users/taugspurger/sandbox/cyberpandas/cyberpandas/parser.py in _to_int_pairs(values)
79 pass
80 else:
---> 81 values = [ipaddress.ip_address(v)._ip for v in values]
82 values = [unpack(pack(v)) for v in values]
83 return values
/Users/taugspurger/miniconda3/envs/py27-ipaddr/lib/python2.7/site-packages/ipaddress.pyc in ip_address(address)
163 '%r does not appear to be an IPv4 or IPv6 address. '
164 'Did you pass in a bytes (str in Python 2) instead of'
--> 165 ' a unicode object?' % address)
166
167 raise ValueError('%r does not appear to be an IPv4 or IPv6 address' %
AddressValueError: '192.168.1.1' does not appear to be an IPv4 or IPv6 address. Did you pass in a bytes (str in Python 2) instead of a unicode object?
In [17]: to_ipaddress(df.addr.astype(unicode))
Out[17]: IPArray([u'192.168.1.1', u'192.168.1.2'])
So in literal code it should be u'192.168.1.1' instead of '192.168.1.1'. The current way is pretty unfriendly :/
By definition, IP address strings have to be ASCII (unlike hostnames), so I don't see a problem with to_ipaddress silently decoding Python 2 str to unicode, assuming it is ASCII. Does that seem reasonable?
Does that seem reasonable?
Yeah, I think so.
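A minimal sketch of the decoding idea discussed above, using only the standard-library ipaddress module (or its Python 2 backport). The helper names ensure_text and parse_ip are hypothetical and are not part of cyberpandas; this is not the actual fix, only an illustration of decoding bytestrings as ASCII before parsing.

```python
import ipaddress


def ensure_text(value):
    """Decode a bytestring to text, assuming ASCII; pass other values through.

    Hypothetical helper for illustration only. On Python 2, `bytes` is `str`,
    so plain string literals such as '192.168.1.1' are decoded here.
    Non-ASCII input raises UnicodeDecodeError instead of being guessed at.
    """
    if isinstance(value, bytes):
        return value.decode("ascii")
    return value


def parse_ip(value):
    """Parse a textual IP address given either as bytes or as text."""
    return ipaddress.ip_address(ensure_text(value))


# Both spellings now parse, regardless of the string type passed in.
addrs = [parse_ip(v) for v in [b"192.168.1.1", u"192.168.1.2"]]
```

One caveat with this approach: it treats every bytestring as text, so packed raw addresses (for example a 4-byte IPv4 value) would need a separate code path.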
Does this mean the IP addresses passed have to be strings? Most data (like the one I tested) that are captured from the devices are not strings.
There are other input methods described in the docs: Python integers, or raw addresses in byte form (see IPArray.from_bytes).
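To illustrate what "integer" and "raw byte" forms of an address look like, here is a small sketch using only the standard-library ipaddress module on Python 3. It does not call cyberpandas; check the IPArray.from_bytes documentation for the exact byte layout it expects.

```python
import ipaddress

# The same IPv4 address in three representations.
text_form = u"192.168.1.1"
addr = ipaddress.ip_address(text_form)

int_form = int(addr)       # a plain Python integer
packed_form = addr.packed  # raw bytes in network (big-endian) order

# On Python 3, both non-string forms parse back to the same address.
from_int = ipaddress.ip_address(int_form)
from_packed = ipaddress.ip_address(packed_form)
```

If captured device data already arrives as integers or packed bytes, no string conversion is needed before parsing.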
To clarify things, let's use Python 3's terminology. "string" is a unicode string, and "bytes" is a bytestring.
Most data (like the one I tested) that are captured from the devices are not strings.
What does the raw data look like for you? If performance is a concern, the absolute fastest way is https://cyberpandas.readthedocs.io/en/latest/api.html#cyberpandas.IPArray.from_bytes
Not sure if this is supported in Python 2.7.13; I got this error. It works fine in Python 3.6.5.
appdata['sa'] = cypd.to_ipaddress(appdata['sa'])
163 '%r does not appear to be an IPv4 or IPv6 address. '
164 'Did you pass in a bytes (str in Python 2) instead of'
--> 165 ' a unicode object?' % address)
166
167 raise ValueError('%r does not appear to be an IPv4 or IPv6 address' %
AddressValueError: '10.44.129.135' does not appear to be an IPv4 or IPv6 address. Did you pass in a bytes (str in Python 2) instead of a unicode object?