Closed arilwan closed 1 year ago
Thanks for the suggestions! Actually, the good news is that this is already possible via Example 11 here: https://rasbt.github.io/mlxtend/user_guide/feature_selection/SequentialFeatureSelector/#example-11-interrupting-long-runs-for-intermediate-results
But please feel free to reopen this in case it doesn't work or doesn't fully solve the problem.
@rasbt
Very sorry to reopen this issue again. I understand from the example you mentioned that intermediate results are accessible when the process is interrupted.
What I hope to do is retrieve those attributes (number of features selected and metric score) and save them to a variable (or write them to a file) after every feature is added in an SFFS run, without interrupting it.
For example, I started running the selection process below on 2 June 2023.
[2023-06-14 08:01:25] Features: 149/240 -- score: 0.8947770129386831[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 25 tasks | elapsed: 21.0min
[Parallel(n_jobs=-1)]: Done 91 out of 91 | elapsed: 60.4min finished
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 25 tasks | elapsed: 20.8min
[Parallel(n_jobs=-1)]: Done 149 out of 149 | elapsed: 100.3min finished
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 25 tasks | elapsed: 21.4min
[Parallel(n_jobs=-1)]: Done 148 out of 148 | elapsed: 89.4min finished
[2023-06-14 12:11:29] Features: 149/240 -- score: 0.8952890526770254[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 25 tasks | elapsed: 18.2min
[Parallel(n_jobs=-1)]: Done 91 out of 91 | elapsed: 53.0min finished
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 25 tasks | elapsed: 18.5min
It has been running for 2 weeks now and still has a long way to go, maybe another 2 weeks.
Suppose those attributes were accessible and, say, saved to a file: I could then do some analysis of the results after Features: 50/240, Features: 100/240, Features: 148/240, etc., without actually interrupting the running process.
Isn't there any way to write those to a file?
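One workaround that needs no change to the library is to parse the verbose progress lines themselves. This is only a rough sketch: it assumes the console output shown above is being redirected to a file, and that the progress lines keep exactly the `[timestamp] Features: n/N -- score: s` shape seen in this log. The names `PROGRESS_RE`, `Progress` and `parse_progress` are made up for illustration and are not part of mlxtend. Note that this recovers only the feature count and the score, not the selected feature names.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Matches progress lines of the shape seen in the log above, e.g.
# "[2023-06-14 12:11:29] Features: 149/240 -- score: 0.8952890526770254"
# The score may be immediately followed by joblib output on the same line.
PROGRESS_RE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"Features: (?P<n_selected>\d+)/(?P<n_total>\d+) -- "
    r"score: (?P<score>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


class Progress(NamedTuple):
    """One parsed progress line (hypothetical helper type)."""
    timestamp: str
    n_selected: int
    n_total: int
    score: float


def parse_progress(lines: Iterable[str]) -> Iterator[Progress]:
    """Yield one Progress record per progress line found in `lines`."""
    for line in lines:
        for m in PROGRESS_RE.finditer(line):
            yield Progress(
                m["timestamp"],
                int(m["n_selected"]),
                int(m["n_total"]),
                float(m["score"]),
            )


# Demo on lines copied from the log in this comment.
sample_log = [
    "[2023-06-14 08:01:25] Features: 149/240 -- score: 0.8947770129386831"
    "[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.",
    "[Parallel(n_jobs=-1)]: Done 25 tasks | elapsed: 21.0min",
    "[2023-06-14 12:11:29] Features: 149/240 -- score: 0.8952890526770254"
    "[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.",
]
rows = list(parse_progress(sample_log))
for row in rows:
    print(row.timestamp, row.n_selected, row.n_total, row.score)
```

Since `parse_progress` accepts any iterable of strings, an open file handle on the redirected log can be passed in directly, and the script can be re-run at any time while the selection keeps going.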
@rasbt Could you please point me to the section of the code I should change so that the attribute values are continuously written to a txt file, updated after every feature is added?
For future reference, linking the discussion here: #1051
Sequential feature selection is a really time-consuming preprocessing task.
Wouldn't it be nice to have some way to access the intermediate features selected while the algorithm keeps running? For example, when using SFFS to select the best of, say, 100 features, it would be nice to be able to retrieve, at round N, the feature subset selected at the end of that selection round.
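To make the request concrete, here is a minimal sketch of the behaviour being asked for: a selection loop that appends the current subset and score to a file at the end of every round. It is not mlxtend's implementation and it is plain greedy forward selection, without the floating (backward) step of SFFS. The function `forward_select`, the `score_fn` callback and the JSON-lines checkpoint format are all assumptions made up for this illustration.

```python
import json
import os
import tempfile
from typing import Callable, List, Sequence


def forward_select(
    n_features: int,
    score_fn: Callable[[Sequence[int]], float],
    k_features: int,
    checkpoint_path: str,
) -> List[int]:
    """Greedy forward selection that checkpoints after every round.

    `score_fn` (hypothetical) takes a list of feature indices and returns
    a score where higher is better.
    """
    selected: List[int] = []
    remaining = list(range(n_features))
    with open(checkpoint_path, "a", encoding="utf-8") as fh:
        while remaining and len(selected) < k_features:
            # Score each candidate subset obtained by adding one feature.
            scored = [(score_fn(selected + [f]), f) for f in remaining]
            # On ties, max() picks the candidate with the larger index.
            best_score, best_f = max(scored)
            selected.append(best_f)
            remaining.remove(best_f)
            # One JSON record per round, flushed so other processes
            # can read the file while the loop is still running.
            record = {
                "round": len(selected),
                "feature_idx": list(selected),
                "score": best_score,
            }
            fh.write(json.dumps(record) + "\n")
            fh.flush()
    return selected


# Toy demo: the "score" of a subset is the sum of made-up feature weights.
weights = [0.1, 0.5, 0.3, 0.05]


def toy_score(subset: Sequence[int]) -> float:
    return sum(weights[i] for i in subset)


checkpoint_path = os.path.join(tempfile.mkdtemp(), "sfs_progress.jsonl")
selected = forward_select(len(weights), toy_score, 2, checkpoint_path)
print(selected)
```

In a real run, `score_fn` would presumably wrap a cross-validated score of the estimator on the chosen columns. Because every round is appended and flushed, the file can be inspected after round 50, 100, 148 and so on without touching the running process.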