I2PC / scipion

Scipion is an image processing framework to obtain 3D models of macromolecular complexes using Electron Microscopy (3DEM)
http://scipion.i2pc.es

Particle extraction failures (empty no-dim output, list index out of range) #2073

Closed mmaldo520 closed 4 years ago

mmaldo520 commented 4 years ago

Hello, I'm having trouble extracting particles on Scipion 2.0 and was hoping you may have some pointers.

I imported .box coordinates from crYOLO using scipion-import coordinates. I've also used scipion-assign ctf to assign ctf from no-dose-weighted micrographs to a subset of dose-weighted micrographs, from which I want to extract the particles.

When I use relion-particles extraction, the protocol fails with "list index out of range":

STARTED: extractMicrographListStep, step 1
00015: 2020-01-14 10:39:11,180 INFO: 2020-01-14 10:39:11.180152
00016: 2020-01-14 10:39:11,266 ERROR: Protocol failed: list index out of range
00017: 2020-01-14 10:39:11,360 INFO: FAILED: extractMicrographListStep, step 1

When I use xmipp3-extractparticles, the protocol finishes, but the output is empty: Output None, SetofParticles (0 items, no-dim, 0.83 A/px)

I've also noticed that the stdout file for the scipion-import coordinates gives a warning for each box ("WARNING: Error parsing coordinate file, skipping this file"), even though the protocol finishes seemingly without problems and the output is a set of 557,391 coordinates.

Why are the extraction protocols failing/producing empty sets? Is the problem with the extraction, or does it originate with my coordinate import or ctf assignment? Sorry if I'm missing something very obvious. I've searched the forums/tutorials for similar situations and error messages and couldn't find relevant matches.

Thank you very much for your help! Maria

azazellochg commented 4 years ago

Hello Maria, Could you please attach the run.stdout file from relion here?

mmaldo520 commented 4 years ago

Hi Grigory,

Here are two stdout files from 2 similar runs. In one case the imported coordinates were .star and in the other they were .box (both imported with the scipion import coordinates).

Thank you for your help, Maria

azazellochg commented 4 years ago

Hi,

I can't seem to find any attachments. You can send them directly to sharov.grigory@gmail.com and I'll have a look tomorrow.

Grigory.

azazellochg commented 4 years ago

Here is the error msg:

Runs/003625_ProtRelionExtractParticles/logs @ unicorn (mmaldo)
| => cat run.stdout
RUNNING PROTOCOL -----------------
     HostName: unicorn.mcb.ucdavis.edu
          PID: 2284
      Scipion: v2.0 (2019-04-23) Diocletian
   currentDir: /media/raid/lettslab/EM_processing/mmaldo/

   workingDir: Runs/003625_ProtRelionExtractParticles
      runMode: Continue
          MPI: 2
      threads: 1
len(steps) 2 len(prevSteps) 0
 Starting at step: 1
 Running steps
STARTED: extractMicrographListStep, step 1
  2020-01-14 18:18:04.491226
Traceback (most recent call last):
  File "/programs/x86_64-linux/scipion/2.0/pyworkflow/protocol/protocol.py", line 186, in run
    self._run()
  File "/programs/x86_64-linux/scipion/2.0/pyworkflow/protocol/protocol.py", line 237, in _run
    resultFiles = self._runFunc()
  File "/programs/x86_64-linux/scipion/2.0/pyworkflow/protocol/protocol.py", line 233, in _runFunc
    return self._func(*self._args)
  File "/programs/x86_64-linux/scipion/2.0/pyworkflow/em/protocol/protocol_particles.py", line 271, in extractMicrographListStep
    self._extractMicrographList(micList, *args)
  File "/programs/x86_64-linux/scipion/2.0/software/lib/python2.7/site-packages/relion/protocols/protocol_extract_particles.py", line 152, in _extractMicrographList
    micsStar = self._getMicsStar(micList)
  File "/programs/x86_64-linux/scipion/2.0/software/lib/python2.7/site-packages/relion/protocols/protocol_extract_particles.py", line 468, in _getMicsStar
    return 'micrographs_%05d-%05d.star' % (micList[0].getObjId(),
IndexError: list index out of range
Protocol failed: list index out of range
FAILED: extractMicrographListStep, step 1
  2020-01-14 18:18:04.541535
*** Last status is failed
------------------- PROTOCOL FAILED (DONE 1/2)

azazellochg commented 4 years ago

@mmaldo520 I was able to reproduce the error, so I will let you know once I know how to fix it.

delarosatrevin commented 4 years ago

Thanks @mmaldo520 and @azazellochg. At least it's good that it can be reproduced. @azazellochg let me know if I can be of any help.

azazellochg commented 4 years ago

First, the problem is here: https://github.com/I2PC/scipion/blob/master/pyworkflow/em/protocol/protocol_particles.py#L382 micDict contains "20170629_00022_frameImage_aligned_mic_DW.mrc" as key, while ctfDict contains "20170629_00022_frameImage.tiff" as key (from ctf.getMicrograph().getMicName()), so they do not match.
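A minimal sketch of the mismatch described above, assuming (as the comment suggests) that micrographs and CTFs are paired by comparing micName strings. Only the two file names come from the thread; `match_by_name` and the dictionary values are hypothetical and are not Scipion's code.

```python
# Hypothetical illustration; this is not the code in protocol_particles.py.
mic_dict = {"20170629_00022_frameImage_aligned_mic_DW.mrc": "micrograph object"}
ctf_dict = {"20170629_00022_frameImage.tiff": "ctf object"}


def match_by_name(mic_dict, ctf_dict):
    """Return the micrograph names that also appear as keys in ctf_dict."""
    return [name for name in mic_dict if name in ctf_dict]


matched = match_by_name(mic_dict, ctf_dict)
print(matched)  # no key is shared, so the list is empty

try:
    first = matched[0]  # same kind of access as micList[0] in the traceback
except IndexError as err:
    error_message = str(err)
    print("IndexError:", error_message)
```

With no shared key the matched list is empty, which would be consistent with both symptoms reported in the thread: indexing the first element fails, and an extraction that skips unmatched micrographs ends with zero particles.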

azazellochg commented 4 years ago

I'm puzzled: how did it work for years?

delarosatrevin commented 4 years ago

I'm not sure about the details now...but maybe this is only a problem for some imports? Where the micName does not match?

azazellochg commented 4 years ago

20170629_00022_frameImage_aligned_mic_DW.mrc != 20170629_00022_frameImage.tiff

I'm using the "other mics" option in the extract protocol.

azazellochg commented 4 years ago

Alright, I see the problem now. Usually the pipeline works if you follow all the steps from motioncor down to extraction, because micName is not changed by any of the protocols. Here, the DW mics were RE-imported, so they have a different micName (with a "blabla_DW" suffix) that does not match the ctf micName, which is still the old one.
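To illustrate why the two names refer to the same acquisition yet fail a plain string comparison, here is a sketch that reduces both to a shared stem. It is only an illustration of the mismatch, not a fix proposed in the thread and not a Scipion function; `base_key` and `DW_SUFFIX` are hypothetical, and the suffix is simply read off the example file name.

```python
import os

# Assumption: suffix taken from the example name in this thread;
# other datasets may use a different one.
DW_SUFFIX = "_aligned_mic_DW"


def base_key(mic_name, suffix=DW_SUFFIX):
    """Strip the file extension and, if present, the trailing suffix."""
    stem, _ext = os.path.splitext(mic_name)
    if stem.endswith(suffix):
        stem = stem[: -len(suffix)]
    return stem


dw_name = "20170629_00022_frameImage_aligned_mic_DW.mrc"
ctf_name = "20170629_00022_frameImage.tiff"

names_equal = dw_name == ctf_name
same_acquisition = base_key(dw_name) == base_key(ctf_name)
print(names_equal)       # the raw names differ
print(same_acquisition)  # the stems agree
```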

azazellochg commented 4 years ago

@mmaldo520 as a quick fix you can make a new subset from the DW mics (your subset 2) instead of re-importing the micrographs. That should solve the extraction problem.

mmaldo520 commented 4 years ago

Thank you very much, Grigory!

I tried extracting after making the new subset, but it failed again seemingly the same way with both Relion and Xmipp extractions. However, I think it may have to do with the way I made this new subset? I used the scipion subset tool (intersection) with full set: original DW micrographs, second input set: the imported set from cryolo. Does this intersection take the names from the full set or the second set? Where is the list of micrographs in the subset kept? Could I correct the list file on the terminal? Or do you have another suggestion as to how to make this subset?

Thanks again, Maria

azazellochg commented 4 years ago

@mmaldo520 hm, looks more complex than I thought. Can you manually make a subset from original DW set by opening it (analyze results) and selecting good ones?

mmaldo520 commented 4 years ago

It's a 9600-micrograph full set and an 8300-micrograph subset that was then picked with cryolo. Doing it manually would be quite intensive. Could I do it at the level of the files with the lists and then compute the difference and remove the extra lines? I could do this on the command line, but where are these micrograph lists stored in scipion? Thank you.
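The list difference asked about here can be sketched as follows, assuming the micrograph names of both sets have already been read into plain Python lists. The thread does not say where Scipion stores these lists, so this sketch does not touch any Scipion file; the function name and the example names are made up for illustration.

```python
def names_not_in_subset(full_names, subset_names):
    """Return names from full_names that are absent from subset_names, in order."""
    subset = set(subset_names)  # set lookup keeps this fast for thousands of names
    return [name for name in full_names if name not in subset]


# Made-up names standing in for the full set and the picked subset.
full = ["mic_0001.mrc", "mic_0002.mrc", "mic_0003.mrc"]
picked = ["mic_0001.mrc", "mic_0003.mrc"]

extra = names_not_in_subset(full, picked)
print(extra)
```

The comparison is an exact string match, so it is subject to the same naming caveat discussed earlier in the thread: names that differ by a suffix or extension will be reported as missing.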

azazellochg commented 4 years ago

I think the easy solution would be to use all 9600 mics in the extraction protocol; if coordinates are missing for the bad ones, they will simply not be extracted.

mmaldo520 commented 4 years ago

Great, thanks for the suggestion! I'll try that

mmaldo520 commented 4 years ago

Unfortunately, that also failed with the same error message. Does that mean there's an error with the imported coordinates too?

In the meantime, I did the particle extraction in Relion outside of Scipion and imported those particles back into Scipion and started a CL2D. This is working so far. However, do you think these particles will have issues in subsequent steps, in which case I should stop the CL2D and solve the current problems first? Or should I assume that subsequent steps will be fine and forget about these current problems? Thanks for your advice.

azazellochg commented 4 years ago

Unless you plan to do polishing (which will require raw movies) all the steps should be ok.

mmaldo520 commented 4 years ago

Thank you very much for all your help, Grigory! I think I'll carry on with what is working for now.

azazellochg commented 4 years ago

@delarosatrevin @pconesa I think we can close this. The original problem was https://github.com/I2PC/scipion/issues/2073#issuecomment-575169403