Closed (xehu closed this 3 months ago)
@xehu This looks quite straightforward. I was reading Helena's branch on task 3 and I think she already addressed this issue. Basically, she labeled each feature as chat level or conversation level and feeds them into the feature builder accordingly. In terms of dependencies, are we talking about sub-features required for a feature, or required packages to be installed?
Yup, I think Helena’s branch, if merged, essentially addresses this task! Right now her branch touches on this AND the next task (allowing users to choose features); it has the start of arguments people can pass in for features they want to include or exclude.

By dependencies, I mean that some features require specific preprocessing steps (for example, word vectors) that need to be run before the feature is computed; so, in that sense, it’s the sub-features required for a feature. (requirements.txt handles the packages we need installed, and in my mind, it’s a little less important to track exactly which feature uses exactly which package.) We need to know what those steps are so that we can skip a preprocessing step when the user doesn’t ask for any of the features that depend on it.

If you look at the channel with Helena, she’s shared a doc (and I’ve added some content) where we’ve basically decided to focus on the two big preprocessing steps (word vectors and sentiment), and to run all the lexical features by default (since it’s cheap and easy to do). However, we are thinking that we’ll default to NOT running the word vectors (generating self.vect_data) because it takes a long time to initially preprocess, and we can save some time if the user doesn’t need the feature. This means we’ll want to track which features depend on these steps and run the preprocessing steps (and the features) depending on those choices.

On Jul 14, 2024, at 11:54 PM, yuxuanzh wrote:
@xehu This looks quite straightforward. I was reading Helena's branch on task 3 and I think she already addressed this issue. Basically, she labeled each feature as chat level or conversation level and feeds them into the feature builder accordingly. In terms of dependencies, are we talking about sub-features required for a feature, or required packages to be installed?
This Issue has two key goals:
Getting Started
utils/calculate_chat_level_features.py (https://github.com/Watts-Lab/team-process-map/blob/main/feature_engine/utils/calculate_chat_level_features.py). In this file, leverage the user-selected feature list and the dependencies to generate the features as efficiently as possible. Right now, since we generate all features by default, we simply go through and call each feature one at a time. Can we do more in this file to track which dependencies are needed, and call only the features the user wants?
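One possible shape for this, as a rough sketch rather than a description of the existing code: keep a mapping from each feature to the preprocessing steps it depends on, take the union of those steps over the user-selected features, run only those steps, and then call only the selected features. Every name below (FEATURE_DEPENDENCIES, required_steps, compute_features, the feature names, and the stub preprocessors) is hypothetical and only illustrates the idea; none of it is taken from calculate_chat_level_features.py.

```python
from typing import Any, Callable, Dict, List, Set

# Hypothetical map: feature name -> preprocessing steps it needs.
# Lexical features need none; others need "sentiment" or "word_vectors".
FEATURE_DEPENDENCIES: Dict[str, Set[str]] = {
    "num_words": set(),
    "positivity": {"sentiment"},
    "embedding_norm": {"word_vectors"},
}


def required_steps(selected: List[str]) -> Set[str]:
    """Union of the preprocessing steps needed by the selected features."""
    steps: Set[str] = set()
    for name in selected:
        steps |= FEATURE_DEPENDENCIES[name]
    return steps


def compute_features(
    selected: List[str],
    preprocessors: Dict[str, Callable[[], Any]],
    feature_fns: Dict[str, Callable[[Dict[str, Any]], Any]],
) -> Dict[str, Any]:
    """Run only the needed preprocessing steps, then only the selected features."""
    prepared = {step: preprocessors[step]() for step in sorted(required_steps(selected))}
    return {name: feature_fns[name](prepared) for name in selected}


# Stub preprocessors (assumptions for the demo) that record when they run.
steps_run: List[str] = []


def _fake_sentiment() -> Dict[str, float]:
    steps_run.append("sentiment")
    return {"positive": 0.8}


def _fake_word_vectors() -> Dict[str, List[float]]:
    steps_run.append("word_vectors")
    return {"vector": [0.1, 0.2]}


PREPROCESSORS = {"sentiment": _fake_sentiment, "word_vectors": _fake_word_vectors}
FEATURE_FNS = {
    "num_words": lambda prepared: 3,
    "positivity": lambda prepared: prepared["sentiment"]["positive"],
    "embedding_norm": lambda prepared: sum(
        v * v for v in prepared["word_vectors"]["vector"]
    ) ** 0.5,
}

# The user asks for two features; the expensive word-vector step is never run.
result = compute_features(["num_words", "positivity"], PREPROCESSORS, FEATURE_FNS)
print(result)
print(steps_run)
```

With a table like this, defaulting to NOT running word vectors falls out naturally: the step is skipped unless a selected feature lists it as a dependency.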