Open HusamAQ opened 2 years ago
Thanks Husam -- glad you're finding it useful!
Sure, I think there should be a way to do it. By calibration set, do you mean pruning set, validation set, probability calibration, or something else?
Thanks for your reply!
Yes, I am trying to get the training data (grow set and pruning set) for each rule in the final model.
Gotcha. When a ruleset is trained, each successive rule is grown and pruned on data that depends on all the rules already in the ruleset. So, for example, the data used to create rule 4 depends on rules 1-3 having already been trained. My understanding is that you want the train/prune data for rule 1, the train/prune data for rule 2 based on rule 1, the train/prune data for rule 3 based on rules 1-2, etc.?
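To make that dependence concrete, here is a toy sketch in plain Python. It is not wittgenstein's code: the function names are made up for illustration, the rules are fixed predicates standing in for learned ones, and it assumes that examples covered by a rule are dropped before the next grow/prune split, as IREP-style covering is usually described.

```python
import random


def split_grow_prune(indices, prune_size=0.33, seed=0):
    """Shuffle the remaining row indices and split them into grow and prune sets."""
    rng = random.Random(seed)
    shuffled = sorted(indices)
    rng.shuffle(shuffled)
    n_prune = int(len(shuffled) * prune_size)
    return set(shuffled[n_prune:]), set(shuffled[:n_prune])


def per_rule_datasets(rows, rules, prune_size=0.33):
    """Record the grow/prune indices each rule would see, in training order.

    `rules` is a list of predicates (hypothetical stand-ins for learned rules).
    """
    remaining = set(range(len(rows)))
    history = []
    for rule in rules:
        grow, prune = split_grow_prune(remaining, prune_size)
        history.append({"grow": grow, "prune": prune})
        # Rows covered by this rule are not available to later rules.
        remaining -= {i for i in remaining if rule(rows[i])}
    return history


if __name__ == "__main__":
    rows = [{"x": i} for i in range(10)]
    rules = [lambda r: r["x"] < 3, lambda r: r["x"] < 6]
    for n, sets in enumerate(per_rule_datasets(rows, rules), start=1):
        print(f"rule {n}: grow={sorted(sets['grow'])} prune={sorted(sets['prune'])}")
```

The point is only that the second rule's grow and prune sets are drawn from whatever the first rule left uncovered, so the per-rule datasets have to be captured while training runs.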
Yes! That is exactly what I am trying to get.
There is a verbosity parameter that you can use when you declare your IREP or RIPPER model. You can set `verbosity=5`, which will give you a bunch of training information that may also be useful to you, but not the actual examples. To get the examples, I'd suggest making a couple of small changes to the code. (You should be able to do this by cloning the repo and importing it locally, or cloning and installing it in editable mode with `pip install -e <directory of package that has setup.py>`.)
The changes you want to make are these: In `base_functions.py`, there are two functions, `grow_rule_cn` and `prune_rule_cn`, that are called each time a rule is added. They each take as parameters `pos_idx` and `neg_idx`, which represent the indices of your dataset that are being used for training/pruning that rule. (`pos_idx` is for the positive class examples, `neg_idx` for the negatives.) At the beginning of each of these two functions, you can add a couple of lines of code to print, write to a file, or however you want to keep track of the datasets.
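As a rough sketch of what those added lines could call, something like the helper below would append one record per call to a JSON Lines file. `log_rule_indices`, `load_rule_indices`, and the file name are hypothetical names made up for this example, not part of the package, and the code assumes `pos_idx` and `neg_idx` are iterables of integer row indices.

```python
import json


def log_rule_indices(stage, pos_idx, neg_idx, path="rule_datasets.jsonl"):
    """Append the indices used for one grow or prune call to a JSON Lines file.

    Assumes the indices are integer-like; int() also converts NumPy integers,
    which json cannot serialize directly.
    """
    record = {
        "stage": stage,  # e.g. "grow" or "prune"
        "pos_idx": sorted(int(i) for i in pos_idx),
        "neg_idx": sorted(int(i) for i in neg_idx),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_rule_indices(path="rule_datasets.jsonl"):
    """Read the logged records back, in the order they were written."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


# Intended use, as the first line inside each function in your local clone:
#   log_rule_indices("grow", pos_idx, neg_idx)    # in grow_rule_cn
#   log_rule_indices("prune", pos_idx, neg_idx)   # in prune_rule_cn
```

Because records are appended in call order, the log keeps the grow and prune sets separate. It may also contain calls for rules that do not end up in the final model, so some matching against the final ruleset may be needed afterwards.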
Let me know how that works for you or if you have any questions!
Oh, and do you want the training and pruning datasets kept separate, or do you only need them together as a combined train+prune dataset?
Hi @imoscovitz, thank you for this amazing package!
I want to use it for my thesis and was wondering if there is a way to get the sets that the rules were made on? So for each rule I would have the training/calibration set for this rule?