Closed mezwick closed 1 month ago
Hi @mezwick, this error indicates that the image files aren't located in the correct place. As noted in the error message, it expects the images to be located at "I:\example_dataset\image_data". Since you downloaded the example dataset manually, make sure you move the images to that location.
Based on what you provided, your directory structure should look something like this:
I:\example_dataset\image_data
│
├── fov0
│ ├── CD3.tiff
│ ├── CD4.tiff
│ ├── CD8.tiff
│ ├── ...
├── fov1
│ ├── CD3.tiff
│ ├── CD4.tiff
│ ├── CD8.tiff
│ ├── ...
├── ...
Alternatively, you can change the path at tiff_dir in the notebook to point to the location of the images.
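To compare an existing folder against the layout described above, a small helper can list what each FOV folder actually holds. This is only a sketch: the function name summarize_image_dir is hypothetical (not part of ark or alpineer), and it assumes the TIFF files sit directly inside each FOV folder.

```python
from pathlib import Path


def summarize_image_dir(tiff_dir):
    """Map each FOV folder under tiff_dir to the sorted TIFF file names it holds."""
    root = Path(tiff_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"tiff_dir does not exist: {root}")
    layout = {}
    for fov_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # keep only .tif/.tiff files found directly inside the FOV folder
        layout[fov_dir.name] = sorted(
            f.name for f in fov_dir.iterdir()
            if f.is_file() and f.suffix.lower() in {".tif", ".tiff"}
        )
    return layout
```

Calling summarize_image_dir(tiff_dir) in the notebook returns a dictionary keyed by FOV name, which makes it easy to spot an empty or misplaced folder.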
Hi.
Thanks for getting back to me :).
I can confirm that the directory structure reflects this and the images are contained in that directory.
I am running the ark_env environment, set up with the environment.yml file cloned from the repo. The only changes I have made to the notebook are to specify
base_dir = r'C:\example_dataset'
and to set segmentation_dir to None, as I do not have segmentation masks and am only interested in running the pixel clustering part of the pipeline.
segmentation_dir = None
I have also run
os.path.exists(base_dir)
os.path.exists(tiff_dir)
to confirm the directories exist; both return True.
Nevertheless, when I run
# run pixel data preprocessing
pixie_preprocessing.create_pixel_matrix(
    fovs,
    channels,
    base_dir,
    tiff_dir,
    pixie_seg_dir,
    img_sub_folder=img_sub_folder,
    seg_suffix=seg_suffix,
    pixel_output_dir=pixel_output_dir,
    data_dir=pixel_data_dir,
    subset_dir=pixel_subset_dir,
    norm_vals_name_post_rownorm=norm_vals_name,
    blur_factor=blur_factor,
    subset_proportion=subset_proportion,
    multiprocess=multiprocess,
    batch_size=batch_size
)
This still returns the error
ValueError: No images found in designated folder, C:\example_dataset\image_data\fov0
But if I list the contents of that folder from Python, I do find the images in that directory:
test_path = r'C:\example_dataset\image_data\fov0'
os.listdir(test_path)
Returns
['CD14.tiff',
'CD163.tiff',
'CD20.tiff',
'CD3.tiff',
'CD31.tiff',
'CD4.tiff',
'CD45.tiff',
'CD68.tiff',
'CD8.tiff',
'CK17.tiff',
'Collagen1.tiff',
'ECAD.tiff',
'ECAD_smoothed.tiff',
'Fibronectin.tiff',
'GLUT1.tiff',
'H3K27me3.tiff',
'H3K9ac.tiff',
'HLADR.tiff',
'IDO.tiff',
'Ki67.tiff',
'PD1.tiff',
'SMA.tiff',
'Vim.tiff']
I should say, the above example was run on a Windows workstation.
I have now also tested on a Linux workstation, again creating the env from the environment.yml file cloned from the repo. This time I downloaded the example data via the code cell in the example notebook.
In this case, the error again reported that the designated images could not be found, but it got as far as fov4. The complete error is copied below.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[16], line 2
1 # run pixel data preprocessing
----> 2 pixie_preprocessing.create_pixel_matrix(
3 fovs,
4 channels,
5 base_dir,
6 tiff_dir,
7 pixie_seg_dir,
8 img_sub_folder=img_sub_folder,
9 seg_suffix=seg_suffix,
10 pixel_output_dir=pixel_output_dir,
11 data_dir=pixel_data_dir,
12 subset_dir=pixel_subset_dir,
13 norm_vals_name_post_rownorm=norm_vals_name,
14 blur_factor=blur_factor,
15 subset_proportion=subset_proportion,
16 multiprocess=multiprocess,
17 batch_size=batch_size
18 )
File ~/anaconda3/envs/ark_env/lib/python3.10/site-packages/ark/phenotyping/pixie_preprocessing.py:345, in create_pixel_matrix(fovs, channels, base_dir, tiff_dir, seg_dir, img_sub_folder, seg_suffix, pixel_output_dir, data_dir, subset_dir, norm_vals_name_pre_rownorm, norm_vals_name_post_rownorm, pixel_thresh_name, channel_percentile_pre_rownorm, channel_percentile_post_rownorm, is_mibitiff, blur_factor, subset_proportion, seed, multiprocess, batch_size)
342 # load existing channel_norm_pre_path if exists, otherwise generate
343 if not os.path.exists(channel_norm_pre_rownorm_path):
344 # compute channel percentiles
--> 345 channel_norm_pre_rownorm_df = pixel_cluster_utils.calculate_channel_percentiles(
346 tiff_dir=tiff_dir,
347 fovs=fovs,
348 channels=channels,
349 img_sub_folder=img_sub_folder,
350 percentile=channel_percentile_pre_rownorm
351 )
352 # save output
353 feather.write_dataframe(
354 channel_norm_pre_rownorm_df, channel_norm_pre_rownorm_path, compression='uncompressed'
355 )
File ~/anaconda3/envs/ark_env/lib/python3.10/site-packages/ark/phenotyping/pixel_cluster_utils.py:45, in calculate_channel_percentiles(tiff_dir, fovs, channels, img_sub_folder, percentile)
42 percentile_list = []
43 for fov in fovs:
44 # load image data and remove 0 valued pixels
---> 45 img = load_utils.load_imgs_from_tree(data_dir=tiff_dir, img_sub_folder=img_sub_folder,
46 channels=[channel], fovs=[fov]).values[0, :, :, 0]
47 img = img[img > 0]
49 # record and store percentile, skip if no non-zero pixels
File ~/anaconda3/envs/ark_env/lib/python3.10/site-packages/alpineer/load_utils.py:166, in load_imgs_from_tree(data_dir, img_sub_folder, fovs, channels, max_image_size)
163 channels = [chan for _, chan in sorted(zip(channels_indices, all_channels))]
165 if len(channels) == 0:
--> 166 raise ValueError(f"No images found in designated folder, {os.path.join(data_dir, fovs[0])}")
168 test_img = io.imread(os.path.join(data_dir, fovs[0], img_sub_folder, channels[0]))
170 # The dtype is always the type of the image being loaded in.
ValueError: No images found in designated folder, ../../../data/example_dataset/image_data/fov4
Again, I have checked and the fov directories do contain the images. I checked with the following code just to be sure...
# Specify the directory path
directory_path = ('/').join([tiff_dir, 'image_data'])

# Function to check for .tiff files in each subdirectory
def check_tiff_in_subdirs(directory_path):
    for subdir, dirs, files in os.walk(directory_path):
        # Check if there is any .tiff file in the current subdir
        if not any(file.endswith('.tiff') or file.endswith('.tif') for file in files):
            # If no .tiff files are found in the current subdir, return False
            return False
    # If all subdirs have at least one .tiff file, return True
    return True

# Function to find directories without .tiff files
def find_dirs_without_tiff(directory_path):
    dirs_without_tiff = []
    for subdir, dirs, files in os.walk(directory_path):
        # Check if there is any .tiff file in the current subdir
        if not any(file.endswith('.tiff') or file.endswith('.tif') for file in files):
            # If no .tiff files are found in the current subdir, add it to the list
            dirs_without_tiff.append(subdir)
    return dirs_without_tiff

# Call the function and print the result
result = check_tiff_in_subdirs(directory_path)
print(f"Every subdirectory contains a .tiff file: {result}")

# Call the function and store the result
directories_without_tiff = find_dirs_without_tiff(directory_path)

# Print the list of directories without .tiff files
print("Directories without .tiff files:")
for directory in directories_without_tiff:
    print(directory)
Which returns
Every subdirectory contains a .tiff file: True
Directories without .tiff files:
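One caveat about the check above, offered as a possibility rather than a diagnosis: os.walk yields nothing and raises no error when given a path that does not exist. If tiff_dir already ends in image_data, the joined path would not exist, and both functions would report exactly this output without inspecting a single file. The sketch below is a stricter variant that fails loudly on a missing directory and looks only at the immediate FOV subfolders. The function name fov_dirs_without_tiffs is hypothetical, not part of ark or alpineer.

```python
import os


def fov_dirs_without_tiffs(image_dir):
    """Return names of immediate subfolders of image_dir holding no .tif/.tiff file."""
    if not os.path.isdir(image_dir):
        # os.walk would silently yield nothing here, so raise instead
        raise FileNotFoundError(f"Not a directory: {image_dir}")
    empty = []
    for entry in sorted(os.scandir(image_dir), key=lambda e: e.name):
        if not entry.is_dir():
            continue
        names = os.listdir(entry.path)
        if not any(n.lower().endswith((".tif", ".tiff")) for n in names):
            empty.append(entry.name)
    return empty
```

An empty list from this function means every FOV folder holds at least one TIFF file, while a wrong path raises an error instead of passing silently.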
Ah! I have solved the issue.
It was because I had not generated the nuclear image specified as channel CD163_nuc_exclude.
Now that I have removed this from the channels list, everything appears to run :).
Sorry for the hassle!
Glad you worked it out! This could be helpful for future users who run into the same error, so thanks!
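For anyone hitting the same error, the mismatch can be found by comparing the channels list with the file names in each FOV folder. A minimal sketch, assuming each channel is stored as <channel>.tiff or <channel>.tif directly inside the FOV folder (no image sub-folder), as in the listing earlier in this thread. The helper name missing_channels is hypothetical and not part of ark or alpineer.

```python
import os


def missing_channels(tiff_dir, fovs, channels):
    """Return {fov: [channels with no matching TIFF file]} for FOVs missing any."""
    missing = {}
    for fov in fovs:
        fov_path = os.path.join(tiff_dir, fov)
        # file names without extension, e.g. "CD3.tiff" -> "CD3"
        stems = {
            os.path.splitext(name)[0]
            for name in os.listdir(fov_path)
            if name.lower().endswith((".tif", ".tiff"))
        }
        absent = [chan for chan in channels if chan not in stems]
        if absent:
            missing[fov] = absent
    return missing
```

With the fov0 listing shown earlier, a channels list containing CD163_nuc_exclude would be flagged, since only CD163.tiff is present. Any channel reported here either needs its image generated first or needs to be removed from channels before calling create_pixel_matrix.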
Describe the bug
Running the 2_Pixie_Cluster_Pixels.ipynb notebook on the example data downloaded to the example data directory, I get an error in pixie_preprocessing.create_pixel_matrix() which explains that
Expected behavior
I expected that the images, which I can see in the named directory, would be found and the pixel matrix created.
To Reproduce
I did not edit the example notebook beyond specifying the base directory.
I manually downloaded the example_dataset from Hugging Face.
I set up the environment via conda with the environment.yml file I cloned from the ark-analysis repository.