tbepler / topaz

Pipeline for particle picking in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples. Also featuring micrograph and tomogram denoising with DNNs.
GNU General Public License v3.0

Preparing training data #185

Closed by ojasvijain 8 months ago

ojasvijain commented 8 months ago

Hi,

I wanted some clarification on creating the training data. My current directory layout is as follows:

raw/
     particles.txt
     images/
          a.mrc
          b.mrc
          c.mrc

My particles.txt file (tab-separated) looks like:

image_name     x_coord     y_coord
a.mrc     345    123
b.mrc     234    344
c.mrc     566     987

I am running training as follows:

topaz train --train-images <path>/raw/images/ \
            --train-targets <path>/raw/particles.txt \
            --save-prefix=saved_models/model \
            -o saved_models/model_tranining.txt \
            -n 400 --num-workers=8 --no-pretrained --image-ext .mrc

I am getting the following error:

WARNING: 5846 micrographs listed in the coordinates file are missing from the training images. Image names are listed below.
# Loaded 5846 training micrographs with 0 labeled particles
ERROR: no training particles specified. Check that micrograph names in the particles file match those in the micrographs file/directory.
Traceback (most recent call last):
  File "/vast/projects/miti2324/envs/topaz_env/bin/topaz", line 33, in <module>
    sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
  File "/vast/projects/miti2324/envs/topaz_env/lib/python3.6/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/vast/projects/miti2324/envs/topaz_env/lib/python3.6/site-packages/topaz/commands/train.py", line 641, in main
    image_ext=args.image_ext
  File "/vast/projects/miti2324/envs/topaz_env/lib/python3.6/site-packages/topaz/commands/train.py", line 272, in load_data
    raise Exception('No training particles.')
Exception: No training particles.

Can you tell me if what I'm doing is correct?

DarnellGranberry commented 8 months ago

Can you create a version of particles.txt where the image names don't include the file extension (for example, "a" instead of "a.mrc") and give that a try? Also, it won't cause issues with the training, but your output filename has a typo ("model_tranining.txt").
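
If it helps, here is a minimal sketch of one way to produce that file. It assumes the coordinates file is tab-separated with a header row and an image_name column, as in the post above. The function name strip_image_extensions and the output filename particles_noext.txt are just illustrative choices, not part of topaz.

```python
import csv
import os


def strip_image_extensions(src_path, dst_path, name_column="image_name"):
    """Copy a tab-separated coordinates file, dropping the file extension
    from every entry in the image-name column.

    Assumes the first row is a header containing `name_column`.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst,
            fieldnames=reader.fieldnames,
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        for row in reader:
            # "a.mrc" -> "a"; names without an extension are left unchanged
            row[name_column] = os.path.splitext(row[name_column])[0]
            writer.writerow(row)


# Example call (hypothetical paths):
# strip_image_extensions("raw/particles.txt", "raw/particles_noext.txt")
```

You would then point --train-targets at the new file and leave the rest of the command unchanged.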

ojasvijain commented 8 months ago

It ran! Thanks a lot :) And yes, silly typo in the output filename, but it still worked.