The model selection fails with an IndexError for the London Tube data set.
Minimal code to reproduce the error:
import pathpyG as pp
paths_tube = pp.PathData.from_ngram('../data/tube_paths_train.ngram', sep=',', weight=True)
m = pp.MultiOrderModel.from_PathData(paths_tube, max_order=2)
m.estimate_order(paths_tube, max_order=2, significance_threshold=0.01)
I suspect this is caused by the use of append_walks in the from_ngram function, which already concatenates the Data objects in PathData. The model selection code, in contrast, seems to assume that each path is stored in its own Data object.
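To illustrate the kind of mismatch I have in mind, here is a plain-Python sketch. It does not use pathpyG, and all function names in it are hypothetical; it only shows how a consumer that indexes one storage entry per path can raise an IndexError once the paths have been concatenated into a single entry. Whether this is what actually happens inside estimate_order is an assumption on my part.

```python
# Hypothetical illustration only: none of these names are pathpyG APIs.

def store_individually(paths):
    """Keep one storage entry per path."""
    return [list(p) for p in paths]


def store_concatenated(paths):
    """Put all paths into a single flat entry and record where each path starts."""
    flat, offsets = [], [0]
    for p in paths:
        flat.extend(p)
        offsets.append(len(flat))
    return [flat], offsets


def path_lengths_assuming_individual(storage, num_paths):
    """Consumer that assumes storage[i] holds the i-th path."""
    return [len(storage[i]) for i in range(num_paths)]


paths = [["a", "b", "c"], ["b", "c"], ["c", "d", "e", "f"]]

# Works: there is one entry per path.
individual = store_individually(paths)
print(path_lengths_assuming_individual(individual, len(paths)))

# Fails: there is only one entry, but the consumer still indexes per path.
concatenated, offsets = store_concatenated(paths)
try:
    path_lengths_assuming_individual(concatenated, len(paths))
except IndexError as exc:
    print("IndexError:", exc)
```

If this is the cause, the consumer would need to slice the concatenated entry using the stored offsets instead of indexing one entry per path.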