BoomStarcuc / GeneSegNet


Training on my own data #2

Open pakiessling opened 10 months ago

pakiessling commented 10 months ago

Congratulations on the publication!

I am interested in trying it out with my own data and I have some questions:

  1. Is the image size 256 x 256 the one you recommend?
  2. Can one also use cell border staining, or is nuclei staining preferred?
  3. How exactly does my input need to be formatted? Just spot X, Y, ground truth, and staining images?
  4. Any recommendation on generating the ground truth masks? Did you just use another segmentation algorithm for this?

Thank you!

BoomStarcuc commented 10 months ago

Hi

Thank you so much for your kind words and interest in our work!

To address your questions:

  1. Yes, we recommend an image size of 256 x 256, as this is the dimension at which we conducted all of our experiments. However, you might experiment with different sizes depending on the specifics of your data. To generate different image sizes, please modify the following preprocessing code (a rough sketch of the idea is included after this list): https://github.com/BoomStarcuc/GeneSegNet/blob/d8ce4302fc5da880b6322e2d0ac821ad7fa58f5b/preprocess/Generate_Image_Label_locationMap.py#L167-L176

  2. We primarily used nuclei staining in our experiments, as we aimed to predict cell boundaries using nuclei staining images combined with gene expressions as input. Although not explicitly tested in our study, our model is adaptable and can process various types of images, such as cell border staining or a combination of nuclei and cell border staining images. It is essential, however, that the image adheres to a standard image format. Furthermore, if you opt for different types of inputs, you might need to adjust the "chan" parameter within the network to match the number of your input channels (see the second sketch after this list). For more details, please refer to the "chan" parameter in the "Training from scratch" section.

  3. Please see the Data preprocess section to process your own data. Yes, initially all you need as input data are gene expressions (spot x, spot y, and gene), a staining image, and training annotations (an example of these minimal inputs follows after this list). If you do not have the gene information, you can leave it out, since gene identity is only used by RNA-based methods such as Baysor and JSTA.

  4. For the datasets we used in our paper, the annotations for the Hippocampus and NSCLC datasets were downloaded from their official websites, and the annotations for the simulation dataset were generated by a toolbox (see: https://cbia.fi.muni.cz/simulator/). If you want to generate annotations for your own real data, I have several recommendations: (1) Completely manual annotation. While time-consuming, manual annotation can produce good results. (2) A combination of pre-trained models from existing cell segmentation algorithms and manual correction. First, run all of your images through the pre-trained model, and then manually correct the incorrect cell segmentations (a sketch of this workflow follows below).
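
For point 1, a minimal, hypothetical sketch of the cropping idea. This is not the repository's exact preprocessing code (the linked `Generate_Image_Label_locationMap.py` is authoritative); the names here are illustrative:

```python
import numpy as np

PATCH_SIZE = 256  # change this to experiment with a different crop size

def crop_into_patches(image, patch_size=PATCH_SIZE):
    """Split a 2D staining image into non-overlapping patch_size x patch_size crops."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    return patches

# Example: a 1024 x 1024 image yields 16 crops of 256 x 256.
dummy = np.zeros((1024, 1024), dtype=np.uint16)
print(len(crop_into_patches(dummy)))  # 16
```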
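
For point 2, an illustrative sketch of combining a nuclei stain and a cell-border stain into a multi-channel input. The file names are hypothetical; only the "chan" parameter mentioned above comes from the repository:

```python
import numpy as np
import tifffile  # assumed here for reading/writing staining TIFFs

nuclei = tifffile.imread("nuclei_stain.tif")      # shape (H, W), hypothetical file
membrane = tifffile.imread("membrane_stain.tif")  # shape (H, W), hypothetical file

# Stack the two stains along a channel axis -> shape (H, W, 2).
combined = np.stack([nuclei, membrane], axis=-1)
tifffile.imwrite("combined_stain.tif", combined)

# When training, set the network's channel count ("chan") to 2 instead of 1
# so it matches the number of input channels.
```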
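
For point 3, a hedged example of what the minimal inputs could look like. The column and file names are illustrative, not the required schema; see the Data preprocess section for the exact format:

```python
import pandas as pd

# One row per RNA spot: spatial coordinates plus (optionally) the gene identity.
spots = pd.DataFrame({
    "spot_x": [120.5, 98.2, 240.7],
    "spot_y": [64.1, 310.9, 77.3],
    "gene":   ["Slc17a7", "Gad1", "Plp1"],  # illustrative gene names
})
spots.to_csv("spots.csv", index=False)

# If gene identities are unavailable, drop the "gene" column; only the
# coordinates are needed here, since gene identity is used by RNA-based
# methods such as Baysor and JSTA.
```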
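
For recommendation (2) in point 4, a sketch assuming Cellpose (classic 2.x API) as the pre-trained segmentation model; any existing cell segmentation tool would work, and the resulting label image would then be manually corrected:

```python
import tifffile
from cellpose import models

image = tifffile.imread("nuclei_stain.tif")  # hypothetical input image

# Run a pre-trained nuclei model to get an initial segmentation.
model = models.Cellpose(model_type="nuclei")
masks, flows, styles, diams = model.eval(image, diameter=None, channels=[0, 0])

# Save the label image as a starting annotation, then manually fix the
# incorrect segmentations (e.g. in an image annotation tool).
tifffile.imwrite("initial_annotation.tif", masks.astype("uint16"))
```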