gitter-lab / t-cell-classification

Jupyter notebooks demonstrating a microscopy machine learning image analysis workflow
BSD 3-Clause Clear License

How can I use a large dataset for this algorithm? #11

Closed skradha26 closed 4 years ago

skradha26 commented 4 years ago

Hi,

I am trying to use a larger dataset. I am benchmarking several file systems with a deep learning application to find out which one performs better for deep learning workloads. This is part of a course in my PhD program. Could you please help me find a larger dataset (between 500 MB and 2 GB) to use with this algorithm? I am a beginner in machine learning, not an expert. Please advise.

Thanks in advance for your help!

agitter commented 4 years ago

Our Jupyter notebooks should be helpful examples for your benchmarking, though you may want to convert the code from notebooks to Python scripts if you are going to do extensive benchmarking. However, the T cell images we used for this project may not fit your needs. Each individual image is small, and the total number of images is modest.

One source of public imaging datasets is the Broad Bioimage Benchmark Collection. They may have datasets that fit your specification, and you could test our code with classification problems involving those datasets.

In addition, a new Nature Methods special issue is all about deep learning in microscopy. Those articles, especially the two Analysis articles, may have open datasets.

For general advice in this area, I highly recommend the image.sc forum. Imaging experts from different software platforms are active there.

agitter commented 4 years ago

I'm closing this issue. Please follow up if you have more questions about our code, or see the suggestions above for general image analysis resources.