fidelity / seq2pat

[AAAI 2022] Seq2Pat: Sequence-to-Pattern Generation Library
https://fidelity.github.io/seq2pat/
GNU General Public License v2.0

Failed to use Attributes to add Constraints #53

Closed (TheZL closed 4 months ago)

TheZL commented 5 months ago

Dear authors,

I am trying to use seq2pat to find patterns in about 30,000 clickstream sequences. The maximum sequence length is 400, and the average length is about 300. Each action is represented by an integer. I also assigned a score to each action in the sequences. The score is a number between 0 and 1, e.g., [0.9997021565900573, 0.9467735896573791, 0.8021896491513977]. In the script, I try to constrain the average score of the patterns to be identified. Here is my script:

    score = Attribute(list_of_scores)
    seq2pat = Seq2Pat(sequences)
    constranted_seq = seq2pat.add_constraint(score.average() <= 0.3)
    patterns = seq2pat.get_patterns(min_frequency=5)

The script works and returns a list of patterns. However, when I changed the constraint in constranted_seq from "score.average() <= 0.3" to "score.average() >= 0.7", I got exactly the same patterns. The constraint does not seem to work for my data.

Do you have any suggestions on how to solve this issue? Thanks!
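
Before looking at the constraint itself, it may be worth confirming that the scores line up one to one with the actions in each sequence. The following is only a minimal sketch, assuming `sequences` and `list_of_scores` are nested Python lists as in the script; `check_alignment` is a hypothetical helper written for illustration and is not part of seq2pat, and the sample data is made up.

```python
def check_alignment(sequences, list_of_scores):
    """Return the indices of sequences whose score list has a different length.

    Hypothetical helper (not a seq2pat function): it only compares lengths.
    """
    if len(sequences) != len(list_of_scores):
        raise ValueError("sequences and list_of_scores differ in length")
    return [
        i
        for i, (seq, scores) in enumerate(zip(sequences, list_of_scores))
        if len(seq) != len(scores)
    ]


# Made-up sample data: the second sequence has two actions but one score.
sequences = [[11, 12, 13], [11, 13]]
list_of_scores = [[0.9, 0.8, 0.1], [0.2]]

mismatched = check_alignment(sequences, list_of_scores)
print(mismatched)  # [1]
```

An empty list means every sequence has exactly one score per action; any index returned points at a sequence to inspect before mining.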

skadio commented 5 months ago

2 quick comments;

TheZL commented 5 months ago

Thanks for the quick reply!

1) For the script, I was following this example: https://github.com/fidelity/seq2pat/blob/master/notebooks/sequential_pattern_mining.ipynb

    # Constraint to specify the average price of patterns
    avg_constraint = seq2pat.add_constraint(3 <= price.average() <= 4)

    # Find patterns that occur at least twice within average price range
    results = seq2pat.get_patterns(min_frequency=2)
    print(results)

2) Let me try to use integers for the attributes instead.
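
One possible way to try integer attributes is to scale the scores by a fixed factor and round them. This is a sketch only: the factor of 1000 and the helper name `scale_scores` are assumptions made for illustration, and whether integer attributes resolve the issue is not confirmed in this thread. Note that the constraint threshold has to be scaled by the same factor (0.3 becomes 300).

```python
def scale_scores(list_of_scores, factor=1000):
    """Convert nested float scores in [0, 1] to integers in [0, factor].

    Hypothetical helper; the factor is an arbitrary choice that trades
    precision against the size of the integer range.
    """
    return [[int(round(s * factor)) for s in seq] for seq in list_of_scores]


# Example scores taken from the report above.
list_of_scores = [[0.9997021565900573, 0.9467735896573791, 0.8021896491513977]]
int_scores = scale_scores(list_of_scores)
print(int_scores)  # [[1000, 947, 802]]

# Untested usage with the calls from the original script, thresholds scaled:
# score = Attribute(int_scores)
# seq2pat.add_constraint(score.average() <= 300)
```

Rounding loses anything below 1/1000, so a larger factor may be preferable if scores cluster close together.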