WayScience / CytoSnake

Orchestrating high-dimensional cell morphology data processing pipelines
https://cytosnake.readthedocs.io
Creative Commons Attribution 4.0 International

Remove iteration process in `feature_select` #89

Open axiomcura opened 11 months ago

axiomcura commented 11 months ago

Looking at this code in `feature_select.py`:

    # iteratively passing normalized data
    for norm_data, feature_file_out in io_files:
        feature_selection(
            normalized_profile=norm_data,
            out_file=feature_file_out,
            config=config_path,
            log_file=log_path,
        )

In addition, the `feature_select.smk` module takes a list of profiles as input, so every profile is handled inside a single job and Snakemake cannot parallelize across them.

It would be best to remove the for loop and take advantage of Snakemake's parallelization.
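
A rough sketch of what the script could look like once the loop is gone: each invocation handles exactly one profile, and Snakemake schedules one job per profile. This is only an illustration under assumptions. The `feature_selection` function below is a stand-in stub, not CytoSnake's real implementation; the file names, the `feature_select_config` params key, and the `run_single_profile` helper are all hypothetical. The only thing taken from Snakemake itself is that a script run through the `script:` directive receives a `snakemake` object exposing `input`, `output`, `params`, and `log`, which is mocked here so the sketch runs on its own.

```python
from types import SimpleNamespace


def feature_selection(normalized_profile, out_file, config, log_file):
    """Stand-in stub (assumption): the real function lives in CytoSnake.

    It only echoes its arguments so this sketch runs without the package.
    """
    return {
        "normalized_profile": normalized_profile,
        "out_file": out_file,
        "config": config,
        "log_file": log_file,
    }


def run_single_profile(smk):
    """Process exactly one profile per script invocation (no for loop)."""
    return feature_selection(
        normalized_profile=str(smk.input[0]),
        out_file=str(smk.output[0]),
        config=str(smk.params["feature_select_config"]),  # hypothetical key
        log_file=str(smk.log[0]),
    )


# In a real workflow Snakemake injects `snakemake` into the script.
# This mock (with made-up file names) replaces it for illustration.
mock_job = SimpleNamespace(
    input=["plate1_normalized.csv.gz"],
    output=["plate1_feature_select.csv.gz"],
    params={"feature_select_config": "configs/feature_select.yaml"},
    log=["logs/plate1_feature_select.log"],
)

result = run_single_profile(mock_job)
print(result["out_file"])
```

With a rule that declares a single input and output per plate (for example through a wildcard), Snakemake would be free to run these jobs concurrently, which is the parallelization the loop currently prevents.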

axiomcura commented 8 months ago

Update: This is related to #41. Using a single input for the `feature_select` module is causing issues. Currently, the script requires a list of file paths, which removes CytoSnake's multiprocessing (handled by Snakemake). However, we want users to have the option to either run feature selection across all plates or choose specific features for each plate, so any fix needs to keep that flexibility.

This will be added to the project roadmap for the next release.