BodenmillerGroup / imctools

Tools to handle IMC data
https://bodenmillergroup.github.io/imctools
MIT License

Unable to parse zipped mcd and txt #41

Closed zamlerd closed 5 years ago

zamlerd commented 5 years ago

Hello @votti ,

I am trying to get the pipeline up and running for our IMC acquisitions but am hitting a wall when trying to convert the zipped mcds into an acceptable input format.

I have zipped the mcd and the region-associated .txt files but cannot figure out where things are going wrong. The cell runs fine with the example data.

Thanks in advance,

smaffiol commented 5 years ago

Hi,

Could you please be more specific about which pipeline you are using, at which step it breaks, and what error message you are getting? Additionally, how did you set up imctools? This helps us debug the issue.

thanks and regards, Sergio


votti commented 5 years ago

Hi there, Sorry for the inconvenience! I just updated the example notebook in https://github.com/BodenmillerGroup/ImcSegmentationPipeline/tree/development/scripts to not silently ignore errors but print the error message instead. Could you run this updated notebook and report the error that gets produced?

Thanks!
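For readers following along, the change described above (report the error instead of silently ignoring it) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: `convert_all` and `fake_convert` are hypothetical names, and the only real API used is the standard library's `logging.exception`, which logs the message together with the full traceback.

```python
import logging


def convert_all(paths, convert):
    """Run `convert` on every path and log failures instead of hiding them."""
    failed = []
    for path in paths:
        try:
            convert(path)
        except Exception:
            # logging.exception records the message plus the full traceback,
            # so the cause of the failure is visible in the notebook output.
            logging.exception("Error in %s", path)
            failed.append(path)
    return failed


def fake_convert(path):
    # Hypothetical stand-in for the real conversion call.
    if path.endswith("bad.zip"):
        raise OSError(22, "Invalid argument")


failed = convert_all(["good.zip", "bad.zip"], fake_convert)
print(failed)
```

Collecting the failed paths in a list makes it easy to re-run only the problematic files once the cause is known.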

zamlerd commented 5 years ago

Thanks for the prompt responses @smaffiol @votti ,

I am using the ImcSegmentationPipeline-development.

I am running on a Windows 10 system.

I set up imctools using the steps provided in the Jupyter notebook under the cell "the IMC preprocessing pipeline for multiplexed image analysis".

The error occurs when converting the zipped IMC acquisitions to the input format.

I suspect something is wrong with the affiliated .txt files, but I tried exporting a fresh batch from MCDViewer and checked that they were formatted similarly to the example datasets.

Here is the error:

ERROR:root:Error in ../../IMPx4/Human\1197344-13-2.zip
Traceback (most recent call last):
  File "", line 11, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\convertfolder2imcfolder.py", line 57, in convert_folder2imcfolder
    imc_fol.write_imc_folder(zipfolder=dozip)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcfolderwriter.py", line 68, in write_imc_folder
    self.mcd.save_slideimage(sid, out_folder)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\mcdparserbase.py", line 284, in save_slideimage
    with open(os.path.join(out_folder, fn_out), 'wb') as f:
OSError: [Errno 22] Invalid argument: '/Users/dzamler/Documents/OMEtiffsLogs\ometiff\1197344-13\1197344-13_s1_slide""'

Please don't hesitate to let me know if you need anything else,

Cheers,

votti commented 5 years ago

Cool! That helps a lot. @.txt files: in the end they are just used as a backup in case the mcd is corrupted, something that used to happen often back in the day. They are optional nowadays. I will try to reflect this better in the description.

It seems like an issue with a file path being constructed in a way that does not conform to Windows. I will need to investigate a bit more and will let you know if I need more info.

votti commented 5 years ago

Were any files generated in the folder: /Users/dzamler/Documents/OMEtiffsLogs\ometiff\1197344-13 at all?

zamlerd commented 5 years ago

@votti I thought that might be a problem originally as well, but it works with the example dataset, unless zipping the files on Windows caused some sort of difference.

Here are the contents of the folder:

Directory of C:\Users\dzamler\Documents\OMEtiffsLogs\ometiff\1197344-13

12/20/2018 11:01 AM .
12/20/2018 11:01 AM ..
12/20/2018 11:03 AM 1,868 1197344-13_AcquisitionChannel_meta.csv
12/20/2018 11:03 AM 78 1197344-13_AcquisitionROI_meta.csv
12/20/2018 11:03 AM 1,337 1197344-13_Acquisition_meta.csv
12/20/2018 11:03 AM 808 1197344-13_Panorama_meta.csv
12/20/2018 11:03 AM 763 1197344-13_ROIPoint_meta.csv
12/20/2018 11:03 AM 289,397,586 1197344-13_s1_p1_r1_a1_ac.ome.tiff
12/20/2018 11:03 AM 292,097,970 1197344-13_s1_p2_r2_a2_ac.ome.tiff
12/20/2018 11:03 AM 292,579,506 1197344-13_s1_p3_r3_a3_ac.ome.tiff
12/20/2018 11:03 AM 35,301 1197344-13_schema.xml
12/20/2018 11:03 AM 260 1197344-13_Slide_meta.csv
10 File(s) 874,115,477 bytes
2 Dir(s) 342,671,175,680 bytes free

votti commented 5 years ago

Strange that it only gives this invalid path error for the slide image and not for anything before, especially as the path seems to be constructed identically every time.

Do the generated ome.tiff files look fine?

Otherwise, all that does not work so far is extracting the extra images (slide overview image, panorama, before & after acquisition image), which are not used for segmentation anyway and should thus not influence the downstream processing.


votti commented 5 years ago

Ah, I think I see what the problem is: somehow the file ending of the encoded slide image filename is wrongly parsed as "". To better understand why this happens, could you copy-paste the entry ImageFile from the 1197344-13_Slide_meta.csv file (in the C:\Users\dzamler\Documents\OMEtiffsLogs\ometiff\1197344-13 folder)?

zamlerd commented 5 years ago

Some of the ome.tiff files channels look okay but some appear to be inverted as far as black and white.

ImageFile =

""

votti commented 5 years ago

@ImageFile: I think that is the reason for the SlideImage failing. Instead of having a truly empty filename, the MCD contains "" as the filename. This can be easily fixed.

Regarding "Some of the ome.tiff files channels look okay but some appear to be inverted as far as black and white": if true, that would be quite worrisome. What kind of image viewer do you use to check whether they are valid? I would really recommend using ImageJ/Fiji to look at the images.

If at all possible, it would likely be easiest if you could give me access to an example zipped .mcd file, so that I can check whether there is something unusual there.
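To illustrate the diagnosis above: the double-quote character is one of the characters Windows does not allow in file names, which is consistent with `open()` failing with `[Errno 22] Invalid argument` for a name ending in `""`. The sketch below is a hedged illustration only, not the fix that went into imctools; `invalid_windows_chars` and `usable_extension` are hypothetical helper names.

```python
# Characters that Windows does not allow in file names.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def invalid_windows_chars(filename):
    """Return the characters in `filename` that Windows would reject."""
    return sorted(set(filename) & WINDOWS_FORBIDDEN)


def usable_extension(raw):
    """Return a cleaned file ending, or None if nothing usable is left.

    Hypothetical helper: treats a literal '""' (as found in the ImageFile
    entry in this thread) the same as an empty string.
    """
    cleaned = raw.strip().strip('"')
    return cleaned or None


name = '1197344-13_s1_slide""'
print(invalid_windows_chars(name))
print(usable_extension('""'))
```

A caller could skip writing the slide image whenever `usable_extension` returns `None`, since, as noted in this thread, these extra images are not used for segmentation.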

zamlerd commented 5 years ago

Should I just change the name in the meta .csv? I have no problem providing access. What would be the best way?

votti commented 5 years ago

Changing the name in the 'meta.csv' would not change anything. It really needs a quick fix in the code. But as said, all the relevant parts of the data are already parsed at the time of the error, so this only affects some metadata images that are likely not so relevant for analysis.

For sharing, please upload the zipped .mcd data as well as a zipped version of your ometiff output folder to something like Dropbox/Google Drive and send me the link by email. I can also provide you with a link to a secure platform from our University to upload to, if you prefer that. My email is vito.zanotelli@uzh.ch Cheers!

zamlerd commented 5 years ago

Hey Vito,

Sorry for the delayed response. Happy New Year and happy holidays to you!

I am just waiting on the okay from my PI to share the files. We're meeting on Thursday, so I should get back to you by then at the latest.

In the meantime I will keep playing with the pipeline as it is.

All the best,

Cheers!


zamlerd commented 5 years ago

Hey Vito,

Moving forward, I got a new error when generating the analysis stacks, again only on the experimental set. The test data runs fine.

ERROR:root:Error in 1197344-13_s1_p1_r1_a1_ac.ome.tiff
Traceback (most recent call last):
  File "", line 13, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\ometiff2analysis.py", line 38, in ometiff_2_analysis
    writer = imc_img.get_image_writer(outname + '.tiff', metals=selmetals, mass=selmass)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisition.py", line 39, in get_image_writer
    order = self.get_metal_indices(metals)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in get_metal_indices
    return [order_dict[m] for m in metallist]
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in
    return [order_dict[m] for m in metallist]
KeyError: 'Nd148'

ERROR:root:Error in 1197344-13_s1_p1_r1_a1_ac.ome.tiff
Traceback (most recent call last):
  File "", line 13, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\ometiff2analysis.py", line 38, in ometiff_2_analysis
    writer = imc_img.get_image_writer(outname + '.tiff', metals=selmetals, mass=selmass)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisition.py", line 39, in get_image_writer
    order = self.get_metal_indices(metals)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in get_metal_indices
    return [order_dict[m] for m in metallist]
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in
    return [order_dict[m] for m in metallist]
KeyError: 'Ru100'

ERROR:root:Error in 1197344-13_s1_p2_r2_a2_ac.ome.tiff
Traceback (most recent call last):
  File "", line 13, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\ometiff2analysis.py", line 38, in ometiff_2_analysis
    writer = imc_img.get_image_writer(outname + '.tiff', metals=selmetals, mass=selmass)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisition.py", line 39, in get_image_writer
    order = self.get_metal_indices(metals)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in get_metal_indices
    return [order_dict[m] for m in metallist]
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in
    return [order_dict[m] for m in metallist]
KeyError: 'Nd148'

ERROR:root:Error in 1197344-13_s1_p2_r2_a2_ac.ome.tiff
Traceback (most recent call last):
  File "", line 13, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\ometiff2analysis.py", line 38, in ometiff_2_analysis
    writer = imc_img.get_image_writer(outname + '.tiff', metals=selmetals, mass=selmass)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisition.py", line 39, in get_image_writer
    order = self.get_metal_indices(metals)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in get_metal_indices
    return [order_dict[m] for m in metallist]
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in
    return [order_dict[m] for m in metallist]
KeyError: 'Ru100'

ERROR:root:Error in 1197344-13_s1_p3_r3_a3_ac.ome.tiff
Traceback (most recent call last):
  File "", line 13, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\ometiff2analysis.py", line 38, in ometiff_2_analysis
    writer = imc_img.get_image_writer(outname + '.tiff', metals=selmetals, mass=selmass)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisition.py", line 39, in get_image_writer
    order = self.get_metal_indices(metals)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in get_metal_indices
    return [order_dict[m] for m in metallist]
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in
    return [order_dict[m] for m in metallist]
KeyError: 'Nd148'

ERROR:root:Error in 1197344-13_s1_p3_r3_a3_ac.ome.tiff
Traceback (most recent call last):
  File "", line 13, in
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\scripts\ometiff2analysis.py", line 38, in ometiff_2_analysis
    writer = imc_img.get_image_writer(outname + '.tiff', metals=selmetals, mass=selmass)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisition.py", line 39, in get_image_writer
    order = self.get_metal_indices(metals)
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in get_metal_indices
    return [order_dict[m] for m in metallist]
  File "C:\Users\dzamler\AppData\Local\Continuum\anaconda2\envs\imctools\lib\site-packages\imctools\io\imcacquisitionbase.py", line 91, in
    return [order_dict[m] for m in metallist]
KeyError: 'Ru100'


votti commented 5 years ago

Hi! It is important that all the channels that are in the pannel.csv and selected for an analysis stack (e.g. either full or ilastik) were actually present in your acquisition. According to the error, some of your acquisitions do not contain measurements for Ru100 or Nd148, which causes these errors. Thus you need to adapt your pannel.csv to fit the acquisitions that you actually did. Does that make sense?

I will note down that these error messages should be improved.
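A check along these lines can catch the mismatch before the stacks are generated. The following is a minimal sketch under stated assumptions, not part of imctools: it assumes the panel CSV has a metal column named `Metal Tag` and 0/1 columns named `full` and `ilastik`, and the targets `MarkerA`/`MarkerB` and the acquired channel list are made up for illustration.

```python
import csv
import io


def selected_metals(panel_file, stack_column, metal_column="Metal Tag"):
    """Return the metals flagged with 1 in `stack_column` of the panel CSV."""
    reader = csv.DictReader(panel_file)
    return [row[metal_column] for row in reader
            if row[stack_column].strip() == "1"]


def missing_from_acquisition(selected, acquired):
    """Return selected metals that the acquisition does not contain."""
    acquired = set(acquired)
    return [metal for metal in selected if metal not in acquired]


# Made-up panel and channel list, for illustration only.
panel_text = """Metal Tag,Target,full,ilastik
Ir191,DNA1,1,1
Nd148,MarkerA,1,0
Ru100,MarkerB,1,1
"""
acquired = ["Ir191", "Ir193"]

for stack in ("full", "ilastik"):
    selected = selected_metals(io.StringIO(panel_text), stack)
    print(stack, missing_from_acquisition(selected, acquired))
```

Any metal reported here would raise a `KeyError` like the ones above when the corresponding stack is written.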

zamlerd commented 5 years ago

Yes it does!

I'll work on adapting the pannel.csv now and see if I can get it up. Can I just put 0 in both full and ilastik or should I remove the channels from the csv?

Cheers,

On Tue, Jan 8, 2019 at 10:29 AM Vito Zanotelli notifications@github.com wrote:

Hi! It is important that all the channels that are in the pannel.csv and selected for a analysis stack (e.g. either full or ilastik) were actually present in your acquisition. According to the error, some of your acquisitions do not contain measurements for Ru100 or Nd148, which causes these errors. Thus you need to adapt your pannel.csv to fit the actual acquisitions that you did. Does that make sense?

I will note down that these error messages should be improved.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/BodenmillerGroup/imctools/issues/41#issuecomment-452362308, or mute the thread https://github.com/notifications/unsubscribe-auth/AVlMzP5mTjFP0uL2Q7wuI6isW1Ied1esks5vBMd0gaJpZM4ZbLJF .

-- Sincerely, D. B. Zamler

votti commented 5 years ago

Hey there! I realized that there is currently no real explanation of the pannel.csv. I thus updated the example scripts with comments (in the section where the pannel.csv is first used): https://github.com/BodenmillerGroup/ImcSegmentationPipeline/blob/development/scripts/imc_preprocessing.ipynb

I hope that helps!

In the end the pannel.csv should really reflect the channels/antibodies that were used for the acquisition, so I would rather remove the channels you did not actually measure.
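Removing the unmeasured rows can also be scripted. The sketch below is only an illustration under assumptions: the column name `Metal Tag`, the example rows, and the helper `drop_unmeasured` are hypothetical, and the result is written to an in-memory buffer rather than to the real panel file.

```python
import csv
import io


def drop_unmeasured(panel_rows, acquired, metal_column="Metal Tag"):
    """Split panel rows into those measured and the metals that were dropped."""
    acquired = set(acquired)
    kept = [row for row in panel_rows if row[metal_column] in acquired]
    dropped = [row[metal_column] for row in panel_rows
               if row[metal_column] not in acquired]
    return kept, dropped


# Made-up panel content, for illustration only.
panel_text = """Metal Tag,Target,full,ilastik
Ir191,DNA1,1,1
Nd148,MarkerA,1,0
Ru100,MarkerB,1,1
"""
rows = list(csv.DictReader(io.StringIO(panel_text)))
kept, dropped = drop_unmeasured(rows, {"Ir191", "Ir193"})

# Write the cleaned panel; a real script would write to a file instead.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                        lineterminator="\n")
writer.writeheader()
writer.writerows(kept)
print(out.getvalue())
print("dropped:", dropped)
```

Printing the dropped metals gives a quick record of which channels were removed from the panel.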

zamlerd commented 5 years ago

Awesome thank you!

I'm gonna give things a run today and will let you know how it goes.

Cheers mate


votti commented 5 years ago

I assume everything worked?

zamlerd commented 5 years ago

Hey Vito!

Thanks for checking in,

Yes, we were able to get everything set up and running. Training took forever, but we got some good segmentation.

I am still rather hung up on the histocat portion, if you have any tips/advice, but I also have my candidacy exam for my Ph.D. on 4/24, and this on top of a machine learning and data algorithms class has me quite underwater.

I will get in touch as soon as I have some data or more questions.

P.S. I met Professor Hartland Jackson at an IMC conference here at Rice University and had a great chat. He spoke highly of you.

All the best,


veenstje commented 3 years ago

Hi @votti

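I'm having a similar problem using our own data with the panel.csv file. Even after adjusting the panel file for our own antibody panel we encounter the error below. I'm not sure why we would be getting the 'Ru102' and 'Ru100' key errors after these were deleted from your base panel. I did verify that we included all the appropriate channels associated with the MCD file.

[image: ru error] https://user-images.githubusercontent.com/83611550/118526153-f2bf1900-b70d-11eb-9179-9827b64566da.PNG

[image: panel] https://user-images.githubusercontent.com/83611550/118526160-f5ba0980-b70d-11eb-9090-d2a06866524d.PNG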

Thanks

plankter commented 3 years ago

Hi @veenstje,

Could you please check which imctools version is installed?

Best regards, Anton

votti commented 3 years ago

Hi there,

This is either: a) you didn't adapt the 'file_path_csv_panel' to the new panel (the one you have printed below not containing Ru100) or b) you have some more rows below in the csv containing Ru100.

So to debug:

Best, Vito


veenstje commented 3 years ago

Thank you both, Anton and Vito, for your quick responses. After further investigation I found that I was adjusting the wrong panel.csv file (the one in the cpout folder instead of the correct file in the config folder). I am now able to complete the pipeline. Thank you!

Have a wonderful day

Jesse
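For anyone who runs into the same mix-up between several copies of the panel file, a small search like the one below can show which copies still mention a given metal. It is a hedged sketch, not part of the pipeline: the helper `panels_mentioning` is hypothetical, the folder names `config` and `cpout` are taken from this thread, and the file contents are made up for the demonstration.

```python
import tempfile
from pathlib import Path


def panels_mentioning(root, metal, pattern="*.csv"):
    """Map each CSV under `root` to whether its text mentions `metal`."""
    root = Path(root)
    hits = {}
    for path in sorted(root.rglob(pattern)):
        relative = path.relative_to(root).as_posix()
        hits[relative] = metal in path.read_text()
    return hits


# Build a throwaway folder layout to demonstrate the search.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "config").mkdir()
    (root / "cpout").mkdir()
    (root / "config" / "panel.csv").write_text(
        "Metal Tag,full,ilastik\nRu100,1,1\n")
    (root / "cpout" / "panel.csv").write_text(
        "Metal Tag,full,ilastik\nIr191,1,1\n")
    result = panels_mentioning(root, "Ru100")

print(result)
```

In this made-up layout the copy under `config` still lists Ru100 while the one under `cpout` does not, which is the kind of discrepancy described in the comment above.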