tbepler / topaz

Pipeline for particle picking in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples. Also featuring micrograph and tomogram denoising with DNNs.
GNU General Public License v3.0

error when using topaz convert #60

Closed olibclarke closed 3 years ago

olibclarke commented 4 years ago

Hi Tristan,

I would like to use topaz convert to convert a particle star file to .box files (I want to do a direct comparison with crYOLO using same training data).

When I run the following command:

topaz convert --from star --to box extracted_particles.star --output boxfiles/

I get this error:

  File "/home/user/software/miniconda2/envs/topaz/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'image_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/software/miniconda2/envs/topaz/bin/topaz", line 11, in <module>
    load_entry_point('topaz-em==0.2.3', 'console_scripts', 'topaz')()
  File "/home/user/software/miniconda2/envs/topaz/lib/python3.6/site-packages/topaz/main.py", line 146, in main
    args.func(args)
  File "/home/user/software/miniconda2/envs/topaz/lib/python3.6/site-packages/topaz/commands/convert.py", line 165, in main
    coords = file_utils.read_coordinates(path, format=from_forms[i])
  File "/home/user/software/miniconda2/envs/topaz/lib/python3.6/site-packages/topaz/utils/files.py", line 158, in read_coordinates
    table['image_name'] = table['image_name'].apply(strip_ext)
  File "/home/user/software/miniconda2/envs/topaz/lib/python3.6/site-packages/pandas/core/frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/user/software/miniconda2/envs/topaz/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'image_name'

This is using Topaz 0.2.4. Here are the first few lines of the input star file, converted from cryoSPARC using csparc2star.py from the pyem package:



loop_
_rlnVoltage #1
_rlnSphericalAberration #2
_rlnAmplitudeContrast #3
_rlnOpticsGroup #4
_rlnImagePixelSize #5
_rlnImageDimensionality #6
300.000000 0.001000 0.100000 0 5.500000 2

data_particles

loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnDefocusU #5
_rlnDefocusV #6
_rlnDefocusAngle #7
_rlnPhaseShift #8
_rlnOpticsGroup #9
000001@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1113 1858 32539.451172 26328.779297 268.115906 0.000000 0
000002@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1870 1743 32699.949219 26489.277344 268.115906 0.000000 0
000003@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1051 436 32476.630859 26265.958984 268.115906 0.000000 0
000004@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 644 2095 32523.177734 26312.505859 268.115906 0.000000 0
000005@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 2731 320 32772.179688 26561.505859 268.115906 0.000000 0
000006@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1767 2652 32519.345703 26308.673828 268.115906 0.000000 0
000007@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1664 794 32617.072266 26406.400391 268.115906 0.000000 0
000008@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1309 2327 32522.851562 26312.179688 268.115906 0.000000 0
000009@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 933 1053 32516.515625 26305.843750 268.115906 0.000000 0
000010@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 2911 1842 32656.023438 26445.351562 268.115906 0.000000 0

Does topaz need to be able to locate the particle mrc file for this to work?
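
For anyone debugging a similar KeyError, one quick check is to list which column labels each table in the star file declares. The sketch below is not Topaz code: `star_columns` is a hypothetical helper that assumes only the plain-text layout shown above (optional `data_` lines, then `loop_` followed by `_rln...` labels).

```python
from collections import OrderedDict


def star_columns(text):
    """Map each data block name to the column labels declared inside it."""
    blocks = OrderedDict()
    current = ""  # labels that appear before any data_ line go under ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("data_"):
            current = line
            blocks.setdefault(current, [])
        elif line.startswith("_"):
            # A label line looks like "_rlnCoordinateX #3"; keep the name only.
            blocks.setdefault(current, []).append(line.split()[0])
    return blocks
```

Calling `star_columns(open("extracted_particles.star").read())` returns one entry per table, so it is easy to see whether `_rlnMicrographName`, `_rlnCoordinateX` and `_rlnCoordinateY` sit in the table you expect. It cannot tell you which tables a given Topaz version actually reads.
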
tbepler commented 4 years ago

I'm not able to reproduce this error when I run topaz convert with the file you posted above. Can you share a minimal star file that gives the error?

As a side note, you probably want to set the --boxsize parameter (it defaults to 0) when writing the .box files. You'll also need to create a 'full_data/' directory inside your 'boxfiles/' output directory to avoid an error. At the moment, topaz does not create those subdirectories automatically. I'm adding that to the list of features for the next version.
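
Until that is automated, the subdirectories can be created up front from the micrograph names in the star file. This is a minimal sketch, not part of Topaz: `make_output_subdirs` is a hypothetical helper, and it assumes the output path mirrors the directory part of each micrograph name (here 'full_data/').

```python
import os


def make_output_subdirs(micrograph_names, output_dir):
    """Create every subdirectory of output_dir implied by the micrograph paths."""
    created = set()
    for name in micrograph_names:
        subdir = os.path.dirname(name)
        if subdir:  # names without a directory part need nothing extra
            target = os.path.join(output_dir, subdir)
            os.makedirs(target, exist_ok=True)
            created.add(target)
    return sorted(created)
```

For the file in this thread, passing the `_rlnMicrographName` values and `"boxfiles"` would create `boxfiles/full_data`, which is the same thing as running mkdir by hand.
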

olibclarke commented 4 years ago

Hi Tristan,

In my hands the section that I posted does give the error (including when I have a full_data directory located within boxfiles and --boxsize set).

tbepler commented 4 years ago

Can you confirm that you are using topaz version 0.2.4? In your error, I see "topaz-em==0.2.3" which makes me think you may be using v0.2.3 instead.
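
One way to check for this kind of mismatch is to ask Python which distribution version is registered and which file the `topaz` module is imported from. The sketch below is a generic check, not a Topaz feature: `describe_install` is a hypothetical helper, and the names 'topaz-em' and 'topaz' are taken from the traceback above.

```python
import importlib


def describe_install(dist_name, module_name):
    """Return (registered distribution version, module file path); None if missing."""
    try:
        from importlib import metadata  # available from Python 3.8
        try:
            version = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            version = None
    except ImportError:
        import pkg_resources  # fallback for older interpreters such as 3.6
        try:
            version = pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            version = None
    try:
        module = importlib.import_module(module_name)
        path = getattr(module, "__file__", None)
    except ImportError:
        path = None
    return version, path
```

Running `print(describe_install("topaz-em", "topaz"))` inside the activated environment shows both values. If the version disagrees with what `topaz --version` prints, or the path points outside the environment, two installs are probably shadowing each other.
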

olibclarke commented 4 years ago

topaz --version reports 0.2.4

tbepler commented 4 years ago

How did you install topaz?

olibclarke commented 4 years ago

Using conda

tbepler commented 4 years ago

I just installed topaz from conda into a fresh environment and reran your command with your file and received no errors. Here's my environment info, file, and command for reference.

Create environment and install topaz:

conda create -n topaz-debug python=3
source activate topaz-debug
conda install topaz -c tbepler -c pytorch

conda list
# packages in environment at /data/cb/tbepler/miniconda2/envs/topaz-debug:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
blas                      1.0                         mkl
ca-certificates           2020.6.24                     0
certifi                   2020.6.20                py38_0
cudatoolkit               10.2.89              hfd86e86_1
freetype                  2.10.2               h5ab3b9f_0
future                    0.18.2                   py38_1
intel-openmp              2020.1                      217
joblib                    0.16.0                     py_0
jpeg                      9b                   h024ee3a_2
lcms2                     2.11                 h396b838_0
ld_impl_linux-64          2.33.1               h53a641e_7
libedit                   3.1.20191231         h14c3975_1
libffi                    3.3                  he6710b0_2
libgcc-ng                 9.1.0                hdf63c60_0
libgfortran-ng            7.3.0                hdf63c60_0
libpng                    1.6.37               hbc83047_0
libstdcxx-ng              9.1.0                hdf63c60_0
libtiff                   4.1.0                h2733197_1
lz4-c                     1.9.2                he6710b0_1
mkl                       2020.1                      217
mkl-service               2.3.0            py38he904b0f_0
mkl_fft                   1.1.0            py38h23d657b_0
mkl_random                1.1.1            py38h0573a6f_0
ncurses                   6.2                  he6710b0_1
ninja                     1.10.0           py38hfd86e86_0
numpy                     1.19.1           py38hbc911f0_0
numpy-base                1.19.1           py38hfa32c7d_0
olefile                   0.46                       py_0
openssl                   1.1.1g               h7b6447c_0
pandas                    1.0.5            py38h0573a6f_0
pillow                    7.2.0            py38hb39fc2d_0
pip                       20.1.1                   py38_1
python                    3.8.3                hcff3b4d_2
python-dateutil           2.8.1                      py_0
pytorch                   1.6.0           py3.8_cuda10.2.89_cudnn7.6.5_0    pytorch
pytz                      2020.1                     py_0
readline                  8.0                  h7b6447c_0
scikit-learn              0.23.1           py38h423224d_0
scipy                     1.5.0            py38h0b6359f_0
setuptools                49.2.0                   py38_0
six                       1.15.0                     py_0
sqlite                    3.32.3               h62c20be_0
threadpoolctl             2.1.0              pyh5ca1d4c_0
tk                        8.6.10               hbc83047_0
topaz                     0.2.4                      py_0    tbepler
torchvision               0.7.0                py38_cu102    pytorch
wheel                     0.34.2                   py38_0
xz                        5.2.5                h7b6447c_0
zlib                      1.2.11               h7b6447c_3
zstd                      1.4.5                h9ceee32_0

Then, I have your file named "error_star.star":

loop_
_rlnVoltage #1
_rlnSphericalAberration #2
_rlnAmplitudeContrast #3
_rlnOpticsGroup #4
_rlnImagePixelSize #5
_rlnImageDimensionality #6
300.000000 0.001000 0.100000 0 5.500000 2

data_particles

loop_
_rlnImageName #1
_rlnMicrographName #2
_rlnCoordinateX #3
_rlnCoordinateY #4
_rlnDefocusU #5
_rlnDefocusV #6
_rlnDefocusAngle #7
_rlnPhaseShift #8
_rlnOpticsGroup #9
000001@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1113 1858 32539.451172 26328.779297 268.115906 0.000000 0
000002@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1870 1743 32699.949219 26489.277344 268.115906 0.000000 0
000003@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1051 436 32476.630859 26265.958984 268.115906 0.000000 0
000004@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 644 2095 32523.177734 26312.505859 268.115906 0.000000 0
000005@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 2731 320 32772.179688 26561.505859 268.115906 0.000000 0
000006@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1767 2652 32519.345703 26308.673828 268.115906 0.000000 0
000007@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1664 794 32617.072266 26406.400391 268.115906 0.000000 0
000008@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 1309 2327 32522.851562 26312.179688 268.115906 0.000000 0
000009@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 933 1053 32516.515625 26305.843750 268.115906 0.000000 0
000010@J968/extract/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted_particles.mrc full_data/19may15d_00018sq_v03_00003hln_00008enn.frames_patch_aligned_doseweighted.mrc 2911 1842 32656.023438 26445.351562 268.115906 0.000000 0

I create a boxfile directory:

mkdir error_box
mkdir error_box/full_data

and run topaz convert:

topaz convert --from star --to box error_star.star --output error_box/

Here is the output: [two screenshots, not preserved]

Without being able to reproduce your error, there's not much more I can do to help. If you try rerunning my steps above, do you still get the error?

olibclarke commented 4 years ago

This is exactly how I installed topaz - I will try re-installing but in the meantime here is what I see in my env:

[screenshot, not preserved]

tbepler commented 4 years ago

Any updates on this, Oli? Did reinstalling fix the problem?

olibclarke commented 4 years ago

Hi Tristan - I haven't tried re-installing completely yet, but I will let you know when I do if it fixes the issue - thanks!

tbepler commented 3 years ago

I'm going to close this issue, since there haven't been any updates for a while. If this problem recurs, feel free to reopen.