retentioneering / retentioneering-tools

Retentioneering: product analytics, data-driven CJM optimization, marketing analytics, web analytics, transaction analytics, graph visualization, process mining, and behavioral segmentation in Python. Predictive analytics over clickstream, AB tests, machine learning, and Markov Chain simulations.
https://doc.retentioneering.com/stable/doc/index.html

Loading custom data #38

Closed abelstam12 closed 1 year ago

abelstam12 commented 4 years ago

Hi, I'm having trouble applying the basic plot_graph example to my own dataset. I can reproduce the step matrix example, but when I try to display the plot_graph, I get the following exception:

`

KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'type'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in _set_item(self, key, value)
   3573         try:
-> 3574             loc = self._info_axis.get_loc(key)
   3575         except KeyError:

/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898

KeyError: 'type'

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)

in
      1 user_events.rete.plot_graph(norm_type=None,
      2                             weight_col=None,
----> 3                             thresh=250)

/opt/conda/lib/python3.7/site-packages/retentioneering/core/core_functions/plot_graph.py in plot_graph(self, targets, weight_col, norm_type, layout_dump, width, height, thresh)
    102         width=width,
    103         height=height,
--> 104         thresh=thresh)
    105
    106     # if work from google colab user HTML display:

/opt/conda/lib/python3.7/site-packages/retentioneering/visualization/plot_utils.py in save_plot_wrapper(*args, **kwargs)
     19             sns.mpl.pyplot.show()
     20             sns.mpl.pyplot.close()
---> 21         res = func(*args, **kwargs)
     22         if len(res) == 2:
     23             (vis_object, name), res, cfg = res, None, None

/opt/conda/lib/python3.7/site-packages/retentioneering/visualization/draw_graph.py in graph(data, node_params, thresh, width, height, interactive, layout_dump, show_percent, plot_name, node_weights, **kwargs)
    227         height=round(height - height / 3),
    228         node_weights=node_weights,
--> 229         **kwargs)
    230
    231     res['node_params'] = node_params

/opt/conda/lib/python3.7/site-packages/retentioneering/visualization/draw_graph.py in _make_json_data(data, node_params, layout_dump, thresh, width, height, **kwargs)
    102     data["type"] = data.apply(
    103         lambda x: node_params.get(x.source) if node_params.get(x.source) == 'source' else node_params.get(
--> 104             x.target) or 'suit', 1)
    105
    106     pos, degrees = _calc_layout(data, node_params, width=width, height=height, **kwargs)

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   3038         else:
   3039             # set column
-> 3040             self._set_item(key, value)
   3041
   3042     def _setitem_slice(self, key: slice, value):

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   3115         self._ensure_valid_index(value)
   3116         value = self._sanitize_column(key, value)
-> 3117         NDFrame._set_item(self, key, value)
   3118
   3119         # check if we are modifying a copy

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in _set_item(self, key, value)
   3575         except KeyError:
   3576             # This item wasn't present, just insert at end
-> 3577             self._mgr.insert(len(self._info_axis), key, value)
   3578             return
   3579

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/managers.py in insert(self, loc, item, value, allow_duplicates)
   1187             value = _safe_reshape(value, (1,) + value.shape)
   1188
-> 1189         block = make_block(values=value, ndim=self.ndim, placement=slice(loc, loc + 1))
   1190
   1191         for blkno, count in _fast_count_smallints(self.blknos[loc:]):

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/blocks.py in make_block(values, placement, klass, ndim, dtype)
   2720         values = DatetimeArray._simple_new(values, dtype=dtype)
   2721
-> 2722     return klass(values, ndim=ndim, placement=placement)
   2723
   2724

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
   2376             values = np.array(values, dtype=object)
   2377
-> 2378         super().__init__(values, ndim=ndim, placement=placement)
   2379
   2380     @property

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
    129         if self._validate_ndim and self.ndim and len(self.mgr_locs) != len(self.values):
    130             raise ValueError(
--> 131                 f"Wrong number of items passed {len(self.values)}, "
    132                 f"placement implies {len(self.mgr_locs)}"
    133             )

ValueError: Wrong number of items passed 3, placement implies 1

`

Code snapshot

`

user_events.dtypes

user_id       int64
event        object
timestamp    object
dtype: object

# raises the exception
user_events.rete.plot_graph(norm_type=None, weight_col=None, thresh=250)

`

Any tips on this? Thanks

abelstam12 commented 4 years ago

Btw, calling plot_graph without arguments seems to work, so the thresh argument is what breaks my example. I'll dive into the documentation to find out why.

Thanks

tokedo commented 3 years ago

Hi, sorry for the late response! Can you try plotting the graph without specifying the thresh argument and then adjusting the threshold in the JS interface using the sliders on the right? Does that work for you without a traceback? We'll try to investigate why passing thresh can crash the plot_graph function.

abelstam12 commented 3 years ago

@tokedo It worked without the threshold. I think no edges in my example crossed the threshold, so the plotting code received an empty input. Maybe this can help identify the issue.

Thanks, Abel
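
The hypothesis above can be checked outside the library with a small sketch. It assumes that edges whose weight is below thresh are dropped before plotting; the thread does not confirm how retentioneering applies thresh internally. The `Edge` type, the sample weights, and the helpers `edges_above` and `safe_thresh` are hypothetical names used only for illustration, and the example uses plain Python rather than pandas to stay dependency-free. One plausible reading of the traceback is that `data.apply(..., 1)` ran on an empty frame and returned a frame rather than a single column, which would match "Wrong number of items passed 3, placement implies 1", but that is an assumption too.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    """Hypothetical edge record; source/target mirror the names in the traceback."""
    source: str
    target: str
    weight: float  # assumed: number of transitions between the two events


# Made-up sample data, not taken from the reporter's dataset.
EDGES = [
    Edge("main", "catalog", 120),
    Edge("catalog", "cart", 45),
    Edge("cart", "payment", 12),
]


def edges_above(edges: List[Edge], thresh: float) -> List[Edge]:
    """Keep only edges whose weight is at least `thresh` (assumed filtering rule)."""
    return [e for e in edges if e.weight >= thresh]


def safe_thresh(edges: List[Edge], thresh: float) -> float:
    """Lower `thresh` to the largest weight if it would remove every edge."""
    if not edges:
        raise ValueError("edge list is empty; nothing to plot")
    return min(thresh, max(e.weight for e in edges))


# With thresh=250 nothing survives, which would reproduce the "empty input" case.
filtered = edges_above(EDGES, 250)
print(len(filtered))

# Clamping the threshold keeps at least the heaviest edge.
adjusted = safe_thresh(EDGES, 250)
print(adjusted, edges_above(EDGES, adjusted))
```

If the filtered list is empty for a given thresh, plotting without thresh and adjusting the sliders, as suggested earlier in the thread, avoids the problem.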

ChernyshovAnton commented 1 year ago

The comments above answered the original question.