data from preprint - GitHub issues

acycliq commented 3 years ago

Hi

Thanks for making this interesting extension available. I would like to run the dataset from Willis et al. that you mention in your preprint, so that I can familiarize myself with the method before running my own data from mouse brain (I guess I will also have to train the network). I have downloaded the zipped file from https://www.repository.cam.ac.uk/handle/1810/262530. However, I cannot work out which files are the 125 3D stacks you describe at the beginning of section 3 of the preprint. If you could also share the manually annotated 3D confocal stack of A. thaliana, that would be greatly appreciated.

Best regards

DEschweiler commented 3 years ago

Hi

Thanks for your interest in our work. From the Willis data set we used all files marked with acylYFP. We have now added CSV file lists to /dataloader, which should make clear which files were used. Please note that these lists already point to the transformed hdf5 files, but the file names match the corresponding tiff files. We also uploaded the validation stack, which is available at https://gigamove.rwth-aachen.de/de/download/a0858aefa11a6a61c0bb152d69fcc972. Please let us know if there are any remaining problems with the data sets.

Best Regards
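
The selection rule described above (all files marked with acylYFP) can be sketched as a small file filter. This is only an illustration: it assumes the marker appears in the file name and that the stacks use a .tif or .tiff extension, and the function name find_acylyfp_stacks is hypothetical. The CSV lists in /dataloader remain the authoritative record of which files were used.

```python
from pathlib import Path


def find_acylyfp_stacks(root, marker="acylYFP", suffixes=(".tif", ".tiff")):
    """Return sorted TIFF paths below `root` whose file name contains `marker`.

    Assumption: the marker is part of the file name and the stacks are
    stored as .tif/.tiff files somewhere below the extracted archive.
    """
    root = Path(root)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes and marker in p.name
    )
```

Calling find_acylyfp_stacks on the folder where the archive was extracted and comparing the number of hits with the 125 stacks mentioned in the preprint gives a quick plausibility check.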

acycliq commented 3 years ago

Thanks so much for your reply. I think I have prepared my h5 files.

I have also made calculate_flows from h5_converter.py faster by spreading the work across several processes. On my PC this operation runs more than 2x faster (quad-core CPU with hyperthreading, hence cpu_count() = 8):

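from functools import partial
from multiprocessing import Pool, cpu_count

import numpy as np
from skimage import measure
# print_timestamp is assumed to come from the repository's own utilities.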

def calculate_flows_par(instance_mask, bg_label=0):
    flow_x = np.zeros(instance_mask.shape, dtype=np.float32)
    flow_y = np.zeros(instance_mask.shape, dtype=np.float32)
    flow_z = np.zeros(instance_mask.shape, dtype=np.float32)
    regions = measure.regionprops(instance_mask)

    # Get number of cores and split labels across that many workers
    processes = cpu_count()

    print_timestamp(f'Using {processes} processes')

    # Chunk up the regions across the processes
    chunks = np.array_split(regions, processes)

    # Map the regions across the processes
    with Pool(processes=processes) as pool:
        result = pool.map(partial(worker, flow_xyz=[flow_x, flow_y, flow_z], instance_mask=instance_mask, bg_label=bg_label), chunks)

    # The summation below assumes that masks do not overlap, i.e. every voxel carries exactly one label.
    flow_x = sum([d[0] for d in result])
    flow_y = sum([d[1] for d in result])
    flow_z = sum([d[2] for d in result])
    return flow_x, flow_y, flow_z

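def worker(regions, flow_xyz, instance_mask, bg_label):
    # Each process receives its own pickled copy of the all-zero flow arrays
    # and fills in only the regions of its chunk.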
    flow_x = flow_xyz[0]
    flow_y = flow_xyz[1]
    flow_z = flow_xyz[2]
    for props in regions:
        if props.label == bg_label:
            continue

        c = props.centroid
        coords = np.where(instance_mask == props.label)

        flow_x[coords] = np.tanh((coords[0] - c[0]) / 5)
        flow_y[coords] = np.tanh((coords[1] - c[1]) / 5)
        flow_z[coords] = np.tanh((coords[2] - c[2]) / 5)

    return flow_x, flow_y, flow_z
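
The non-overlap assumption behind the final summation can be checked on a toy example. The sketch below is not the repository's code: it uses NumPy only, runs serially, and the helper name flows_for_labels is hypothetical. It computes the same tanh-scaled offsets from each label's centroid, once for all labels together and once per chunk of labels, and confirms that summing the per-chunk results reproduces the full result.

```python
import numpy as np


def flows_for_labels(instance_mask, labels, scale=5.0):
    """Compute tanh-scaled centroid offsets for the given labels only.

    Returns an array of shape (3, *instance_mask.shape); voxels that do not
    belong to any of `labels` stay zero.
    """
    flows = np.zeros((3,) + instance_mask.shape, dtype=np.float32)
    for label in labels:
        coords = np.where(instance_mask == label)
        if coords[0].size == 0:
            continue  # label not present in the mask
        for axis in range(3):
            centroid = coords[axis].mean()
            flows[axis][coords] = np.tanh((coords[axis] - centroid) / scale)
    return flows


# Toy 3D mask with three non-overlapping instances (0 is background).
mask = np.zeros((6, 6, 6), dtype=np.uint16)
mask[0:3, 0:3, 0:3] = 1
mask[3:6, 3:6, 3:6] = 2
mask[0:2, 4:6, 0:2] = 3

labels = [label for label in np.unique(mask) if label != 0]

# All labels at once versus chunked labels summed afterwards.
serial = flows_for_labels(mask, labels)
chunks = np.array_split(labels, 2)
chunked = sum(flows_for_labels(mask, chunk) for chunk in chunks)

assert np.allclose(serial, chunked)
```

Because every voxel belongs to at most one label, each chunk writes to a disjoint set of voxels, so adding the per-chunk arrays never mixes values from different instances. With overlapping masks the sum would add contributions together and the check would fail.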

DEschweiler commented 2 years ago

Glad to hear that it works now. Thank you very much for the improvements, we will certainly consider this in our next update!

acycliq commented 2 years ago

Hi again

Sorry if this sounds quite basic, but once you have trained the network, how do you segment a new, previously unseen image? I have run train_network.py and would now like to use the trained network to segment another image. There must be some logic that saves the trained network and then applies it to new user-supplied input data, but I haven't found it.

DEschweiler commented 2 years ago

Hi, the training script (train_network.py) saves the model to the directory provided with the --output_path parameter. To apply the model to unseen data, there is another dedicated script (apply_network.py), which will use the saved model provided with the --ckpt_path parameter and the data provided with the test list. After obtaining the predicted foreground and flow maps from the network, you can use the apply_cellpose.py script to reconstruct the final instance segmentations by providing the path to the output folder with --filedir.
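
The three steps described above can be sketched as a small helper that assembles the command lines. Only the script names and the --output_path, --ckpt_path and --filedir parameters come from this thread. The paths and the function name build_pipeline_commands are hypothetical, how the test list is passed to apply_network.py is not stated here and is therefore left out, and the scripts may require further arguments.

```python
import shlex


def build_pipeline_commands(model_dir, ckpt_path, prediction_dir):
    """Assemble the train / apply / reconstruct commands as argument lists.

    All three paths are placeholders chosen by the caller.
    """
    return [
        # 1. Train the network; the model is saved below --output_path.
        ["python", "train_network.py", "--output_path", str(model_dir)],
        # 2. Predict foreground and flow maps with the saved model.
        ["python", "apply_network.py", "--ckpt_path", str(ckpt_path)],
        # 3. Reconstruct instance segmentations from the predictions.
        ["python", "apply_cellpose.py", "--filedir", str(prediction_dir)],
    ]


# Hypothetical paths, printed for review rather than executed.
for command in build_pipeline_commands(
    "results/model", "results/model/example.ckpt", "results/predictions"
):
    print(shlex.join(command))
```

Printing the commands first makes it easy to compare them with each script's actual argument list before running anything.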

stegmaierj / Cellpose3D

data from preprint #3