angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

2_Pixie_Cluster_Pixel.ipynb not creating pixel_meta_cluster_mapping.csv #1086

Closed milesbailey121 closed 9 months ago

milesbailey121 commented 9 months ago

`2_Pixie_Cluster_Pixel.ipynb` does not create `pixel_meta_cluster_mapping.csv`, but no error is thrown at that point. The notebook keeps running until the file is needed later by `pixel_meta_clustering.apply_pixel_meta_cluster_remapping()`.
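One way to surface this kind of silent failure earlier is to check for the expected output right after the step that should write it. The following is a minimal sketch and not part of ark-analysis: `require_file` is a hypothetical helper, and the path passed to it would be built from whatever `base_dir` and remap file name the notebook defines.

```python
from pathlib import Path


def require_file(path, hint=""):
    """Return `path` as a Path if it is an existing file, else raise FileNotFoundError.

    Hypothetical helper for illustration; it is not an ark-analysis function.
    """
    p = Path(path)
    if not p.is_file():
        message = f"Expected output file is missing: {p}"
        if hint:
            message += f" ({hint})"
        raise FileNotFoundError(message)
    return p
```

Calling something like `require_file(os.path.join(base_dir, pixel_meta_cluster_remap_name))` immediately after the meta clustering step would make the notebook stop at the cell that failed to write the file, rather than several cells later.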

A few days ago the code was running correctly and producing the needed file output. I haven't changed any code except the filepaths:

```python
# define the home directory (should contain pixel_output_dir from pixel clustering notebook)
base_dir = "data_processing"

# define the name of the folder containing the pixel cluster data
pixel_output_dir = '_pixel_output_dir'

# define the name of the cell clustering params file
cell_clustering_params_name = 'cell_clustering_params.json'

# define the base output cell folder
cell_output_dir = '%s_cell_output_dir' % cell_cluster_prefix
if not os.path.exists(os.path.join(base_dir, "pixie", cell_output_dir)):
    os.mkdir(os.path.join(base_dir, "pixie", cell_output_dir))

# define the paths to cell clustering files, explicitly set the variables to use custom names
cell_som_weights_name = os.path.join("pixie", cell_output_dir, 'cell_som_weights.feather')
cluster_counts_name = os.path.join("pixie", cell_output_dir, 'cluster_counts.feather')
cluster_counts_size_norm_name = os.path.join("pixie", cell_output_dir, 'cluster_counts_size_norm.feather')
weighted_cell_channel_name = os.path.join("pixie", cell_output_dir, 'weighted_cell_channel.feather')
cell_som_cluster_count_avg_name = os.path.join("pixie", cell_output_dir, 'cell_som_cluster_count_avg.csv')
cell_meta_cluster_count_avg_name = os.path.join("pixie", cell_output_dir, 'cell_meta_cluster_count_avg.csv')
cell_som_cluster_channel_avg_name = os.path.join("pixie", cell_output_dir, 'cell_som_cluster_channel_avg.csv')
cell_meta_cluster_channel_avg_name = os.path.join("pixie", cell_output_dir, 'cell_meta_cluster_channel_avg.csv')
cell_meta_cluster_remap_name = os.path.join("pixie", cell_output_dir, 'cell_meta_cluster_mapping.csv')
```
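A side note on the directory setup quoted above: `os.mkdir` creates only the last path component, so it raises `FileNotFoundError` if the parent `pixie` folder does not exist yet. A minimal sketch of a more forgiving variant uses `os.makedirs(..., exist_ok=True)`. The `ensure_output_dir` helper name is an illustrative assumption and is not part of the notebook.

```python
import os


def ensure_output_dir(base_dir, *parts):
    """Create base_dir/parts... including any missing parents; return the full path.

    Hypothetical helper for illustration; it is not an ark-analysis function.
    """
    out_dir = os.path.join(base_dir, *parts)
    # exist_ok=True makes the call a no-op when the folder already exists,
    # so no separate os.path.exists check is needed.
    os.makedirs(out_dir, exist_ok=True)
    return out_dir
```

With this, the output folder could be prepared as `ensure_output_dir(base_dir, "pixie", cell_output_dir)`, and re-running the cell would be harmless.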

Here is the code that throws an error because pixel_meta_cluster_mapping.csv is not created.

```python
# rename the meta cluster values in the pixel dataset
pixel_meta_clustering.apply_pixel_meta_cluster_remapping(
    fovs,
    channels,
    base_dir,
    pixel_data_dir,
    pixel_meta_cluster_remap_name,
    multiprocess=multiprocess,
    batch_size=batch_size
)

# recompute the mean channel expression per meta cluster and apply these new names to the SOM cluster average data
pixel_meta_clustering.generate_remap_avg_files(
    fovs,
    channels,
    base_dir,
    pixel_data_dir,
    pixel_meta_cluster_remap_name,
    pc_chan_avg_som_cluster_name,
    pc_chan_avg_meta_cluster_name
)
```

Here's the error output:

```
c:\Users\miles\GitHub\ark-analysis-implementation\2_Pixie_Cluster_Pixels.ipynb Cell 44 line 2
      1 # rename the meta cluster values in the pixel dataset
----> 2 pixel_meta_clustering.apply_pixel_meta_cluster_remapping(
      3     fovs,
      4     channels,
      5     base_dir,
      6     pixel_data_dir,
      7     pixel_meta_cluster_remap_name,
      8     multiprocess=multiprocess,
      9     batch_size=batch_size
     10 )
     12 # recompute the mean channel expression per meta cluster and apply these new names to the SOM cluster average data
     13 pixel_meta_clustering.generate_remap_avg_files(
     14     fovs,
     15     channels,
   (...)
     20     pc_chan_avg_meta_cluster_name
     21 )

File c:\Users\miles\GitHub\ark-analysis-implementation.venv\lib\site-packages\ark\phenotyping\pixel_meta_clustering.py:363, in apply_pixel_meta_cluster_remapping(fovs, channels, base_dir, pixel_data_dir, pixel_remapped_name, multiprocess, batch_size)
    360 pixel_remapped_path = os.path.join(base_dir, pixel_remapped_name)
    362 # file path validation
--> 363 io_utils.validate_paths([pixel_data_path, pixel_remapped_path])
...
---> 38 raise FileNotFoundError(
     39     f"The file/path, {pathlib.Path(path).name}, could not be found..."
     40 )

FileNotFoundError: The file/path, pixel_meta_cluster_mapping.csv, could not be found...
```
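The traceback shows that the remap path is built as `os.path.join(base_dir, pixel_remapped_name)` before it is validated. A quick way to narrow the problem down is to rebuild that path by hand and look at what is actually on disk. The sketch below is a hypothetical diagnostic, not an ark-analysis function; `describe_expected_file` and its report keys are assumptions made for illustration.

```python
import os


def describe_expected_file(base_dir, relative_name):
    """Report whether base_dir/relative_name exists and which CSVs sit beside it.

    Hypothetical diagnostic for illustration; it is not an ark-analysis function.
    """
    full_path = os.path.join(base_dir, relative_name)
    parent = os.path.dirname(full_path)
    report = {
        "path": full_path,
        "exists": os.path.isfile(full_path),
        "parent_exists": os.path.isdir(parent),
        "csv_files_in_parent": [],
    }
    if report["parent_exists"]:
        # List sibling CSV files so a misnamed or misplaced mapping file is easy to spot.
        report["csv_files_in_parent"] = sorted(
            name for name in os.listdir(parent) if name.lower().endswith(".csv")
        )
    return report
```

Running `describe_expected_file(base_dir, pixel_meta_cluster_remap_name)` in the notebook separates two cases. If `parent_exists` is `False`, the path variables probably point at the wrong folder. If the parent exists but the file does not, the step that should write the mapping did not run or did not save its output.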