f-hamidlab / nuclearpy

MIT License

ngs.add_nuclear_features() #12

Closed Marcel-Salier closed 2 years ago

Marcel-Salier commented 2 years ago

It gets stuck when it reaches the 24th file:

ngs.add_nuclear_features() 83%|███████████████████████████████████▌ | 24/29 [00:12<00:02, 1.87it/s]

ValueError                                Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 ngs.add_nuclear_features()

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/ngtools/segmentation.py:1622, in NuclearGame_Segmentation.add_nuclear_features(self)
   1620 image = self.data["files"][file]['working_array'][self.data["channels_info"][ch]]
   1621 if ch == self.data["dna_marker"]:
-> 1622     out_nuclear_layers = nucleus_layers_fast(image, mask,
   1623                                              xscale = self.data["files"][file]['metadata']['XScale'])
   1624     self.data["files"][file]["nuclear_features"][f"avg_intensitycore{ch}"] = out_nuclear_layers[1]
   1625     self.data["files"][file]["nuclear_features"][f"avg_intensity_internalring{ch}"] = out_nuclear_layers[3]

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/ngtools/segmentation.py:734, in nucleus_layers_fast(image, mask, xscale)
    731 area_before = area_after
    732 area_after = [ceil(core_mask_props[n]['area']) for n in range(len(core_mask_props))]
--> 734 to_erode = (np.array(area_after) != np.array(area_before))&(np.array(area_after) > np.array(area_0)/2)
    737 core = core_mask
    738 core_props = regionprops(core, intensity_image=image)

ValueError: operands could not be broadcast together with shapes (101,) (102,)

fursham-h commented 2 years ago

That is not the updated code. I have added an additional line before line 732, which was not reflected in the error you pasted.

Try running pip install . using terminal.
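
A quick way to confirm which copy of the package the notebook actually imports is to print the interpreter path and the module location. The sketch below uses only the standard library; `describe_import` is a hypothetical helper name, not part of ngtools, and "json" stands in for the package name so that the snippet runs anywhere.

```python
import importlib.util
import sys


def describe_import(package_name: str) -> dict:
    """Report the running interpreter and where a top-level package would be loaded from.

    Hypothetical helper for illustration; it does not import the package itself.
    """
    spec = importlib.util.find_spec(package_name)
    return {
        "interpreter": sys.executable,
        # origin is the file the import system would load, or None if not found
        "location": spec.origin if spec is not None else None,
    }


# "json" is only a stand-in; in the notebook, pass "ngtools" instead.
info = describe_import("json")
print(info["interpreter"])
print(info["location"])
```

If the location points into site-packages while the edited files live in a local clone of the repository, the notebook is still running the previously installed copy.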

Marcel-Salier commented 2 years ago

I think I still have problems with the update. I tried again and got the same error:

83%|███████████████████████████████████▌ | 24/29 [00:12<00:02, 1.95it/s]

ValueError                                Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 ngs.add_nuclear_features()

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/ngtools/segmentation.py:1622, in NuclearGame_Segmentation.add_nuclear_features(self)
   1620 image = self.data["files"][file]['working_array'][self.data["channels_info"][ch]]
   1621 if ch == self.data["dna_marker"]:
-> 1622     out_nuclear_layers = nucleus_layers_fast(image, mask,
   1623                                              xscale = self.data["files"][file]['metadata']['XScale'])
   1624     self.data["files"][file]["nuclear_features"][f"avg_intensitycore{ch}"] = out_nuclear_layers[1]
   1625     self.data["files"][file]["nuclear_features"][f"avg_intensity_internalring{ch}"] = out_nuclear_layers[3]

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/ngtools/segmentation.py:734, in nucleus_layers_fast(image, mask, xscale)
    731 area_before = area_after
    732 area_after = [ceil(core_mask_props[n]['area']) for n in range(len(core_mask_props))]
--> 734 to_erode = (np.array(area_after) != np.array(area_before))&(np.array(area_after) > np.array(area_0)/2)
    737 core = core_mask
    738 core_props = regionprops(core, intensity_image=image)

ValueError: operands could not be broadcast together with shapes (101,) (102,)

Marcel-Salier commented 2 years ago

I opened a terminal, but it is in the base environment, not in ngtools. What is the command to change it? I ran it again and got the same error, even though the files in the folder were updated.

fursham-h commented 2 years ago

I'll help you tomorrow in person. I found a hack that will make sure you don't have to re-run pip install.

Marcel-Salier commented 2 years ago

It went through. Now there is an error when saving the data:

Export nuclear features measured as CSV In [24]:

ngs.export_csv(filename = "outputD3GFAPLbPh.csv")

ValueError                                Traceback (most recent call last)
Input In [24], in <cell line: 1>()
----> 1 ngs.export_csv(filename = "outputD3GFAPLbPh.csv")

File ~/Documents/GitHub/ng_tools/ngtools/segmentation.py:2070, in NuclearGame_Segmentation.export_csv(self, filename)
   2067 for ft in lst_fts:
   2068     dct_df[ft] = [l for file in self.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
-> 2070 df_out = pd.DataFrame.from_dict(data = dct_df)
   2071 df_out.to_csv(self.path_save + filename, index = False)
   2073 print(f"CSV file saved as: {self.path_save + filename}.")

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/pandas/core/frame.py:1677, in DataFrame.from_dict(cls, data, orient, dtype, columns)
   1674     raise ValueError("only recognize index or columns for orient")
   1676 if orient != "tight":
-> 1677     return cls(data, index=index, columns=columns, dtype=dtype)
   1678 else:
   1679     realdata = data["data"]

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/pandas/core/frame.py:636, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    630     mgr = self._init_mgr(
    631         data, axes={"index": index, "columns": columns}, dtype=dtype, copy=copy
    632     )
    634 elif isinstance(data, dict):
    635     # GH#38939 de facto copy defaults to False only in non-dict cases
--> 636     mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
    637 elif isinstance(data, ma.MaskedArray):
    638     import numpy.ma.mrecords as mrecords

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/pandas/core/internals/construction.py:502, in dict_to_mgr(data, index, columns, dtype, typ, copy)
    494 arrays = [
    495     x
    496     if not hasattr(x, "dtype") or not isinstance(x.dtype, ExtensionDtype)
    497     else x.copy()
    498     for x in arrays
    499 ]
    500 # TODO: can we get rid of the dt64tz special case above?
--> 502 return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/pandas/core/internals/construction.py:120, in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
    117 if verify_integrity:
    118     # figure out the index, if necessary
    119     if index is None:
--> 120         index = _extract_index(arrays)
    121     else:
    122         index = ensure_index(index)

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/pandas/core/internals/construction.py:674, in _extract_index(data)
    672 lengths = list(set(raw_lengths))
    673 if len(lengths) > 1:
--> 674     raise ValueError("All arrays must be of the same length")
    676 if have_dicts:
    677     raise ValueError(
    678         "Mixing dicts with non-Series may lead to ambiguous ordering."
    679     )

ValueError: All arrays must be of the same length

fursham-h commented 2 years ago

Made a quick fix for this (b1dffde8cfeade9258a73c59d58f4f2a92c717fb). I am sure the problem is in the spatial_entropy part of the code (I could have sworn that all nuclei should have a value for this).

fursham-h commented 2 years ago

If the error still persists, could you run the following code in a new code cell in the notebook (the same notebook in which you are running the segmentation)?

lst_fts = ngs.get_lst_features()
dct_df = {}
for ft in lst_fts:
    dct_df[ft] = [l for file in ngs.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
for key, values in dct_df.items():
    print(f"{key} : {len(values)} ") 

Marcel-Salier commented 2 years ago

The error still persists. I ran the code in a new code cell at the bottom and got this:

lst_fts = ngs.get_lst_features()
dct_df = {}
for ft in lst_fts:
    dct_df[ft] = [l for file in ngs.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
for key, values in dct_df.items():
    print(f"{key} : {len(values)} ")

NameError                                 Traceback (most recent call last)
Input In [24], in <cell line: 3>()
      2 dct_df = {}
      3 for ft in lst_fts:
----> 4     dct_df[ft] = [l for file in ngs.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
      5 for key, values in dct_df.items():
      6     print(f"{key} : {len(values)} ")

Input In [24], in <listcomp>(.0)
      2 dct_df = {}
      3 for ft in lst_fts:
----> 4     dct_df[ft] = [l for file in ngs.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
      5 for key, values in dct_df.items():
      6     print(f"{key} : {len(values)} ")

NameError: name 'self' is not defined

fursham-h commented 2 years ago

My bad, typo. Try this:

lst_fts = ngs.get_lst_features()
dct_df = {}
for ft in lst_fts:
    dct_df[ft] = [l for file in ngs.data["files"] for l in ngs.data["files"][file]["nuclear_features"][ft]]
for key, values in dct_df.items():
    print(f"{key} : {len(values)} ") 
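
If the totals printed above differ between features, a per-file breakdown can narrow down where the mismatch comes from. This is a minimal sketch, assuming each entry of ngs.data["files"] holds a "nuclear_features" dict of per-nucleus lists (as the comprehension above implies); `find_mismatched_features` and the demo data are made up for illustration.

```python
from collections import Counter


def find_mismatched_features(files: dict) -> dict:
    """For each file, list features whose length differs from the most common length.

    Hypothetical helper; `files` is assumed to look like ngs.data["files"].
    """
    report = {}
    for name, entry in files.items():
        lengths = {ft: len(vals) for ft, vals in entry["nuclear_features"].items()}
        if not lengths:
            continue
        # Treat the most frequent length as the expected number of nuclei.
        expected, _ = Counter(lengths.values()).most_common(1)[0]
        bad = {ft: n for ft, n in lengths.items() if n != expected}
        if bad:
            report[name] = {"expected": expected, "mismatched": bad}
    return report


# Made-up data mimicking the assumed layout of ngs.data["files"].
demo = {
    "img_01": {"nuclear_features": {"area": [1, 2, 3], "spatial_entropy": [0.1, 0.2, 0.3]}},
    "img_02": {"nuclear_features": {"area": [4, 5, 6], "circularity": [0.9, 0.8, 0.7],
                                    "spatial_entropy": [0.4, 0.5]}},
}
print(find_mismatched_features(demo))
```

In the notebook, the call would be `find_mismatched_features(ngs.data["files"])`, provided the real data follows the layout assumed here.
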

fursham-h commented 2 years ago

I found the bug. Fixed it in the latest push (21a7b7d0f5cedf340cb624ed0412344d8d121366).

Marcel-Salier commented 2 years ago

I pulled it, and the latest modification to the segmentation file is from 13 hours ago. Is that all right, or did you change something this morning?

Marcel-Salier commented 2 years ago

Good news. It works with the snap 360!!!! Now, I will run the whole lot to see.

Marcel-Salier commented 2 years ago

I got an error at the calculation of peaks and dots:

KeyError                                  Traceback (most recent call last)
Input In [36], in <cell line: 1>()
----> 1 ngs.find_dna_peaks(box_size = 10, zoom_box_size = 200)
      2 ngs.find_dna_dots(zoom_box_size = 200)

File ~/Documents/GitHub/ng_tools/ngtools/segmentation.py:1711, in NuclearGame_Segmentation.find_dna_peaks(self, box_size, zoom_box_size)
   1707 nucleus = self.data["files"][file]['working_array'][
   1708     self.data["channels_info"][self.data["dna_marker"]]].copy()
   1709 nucleus[masks == 0] = 0
-> 1711 th = self.data["files"][file]["th_array"]
   1714 ignore_mask = np.zeros(masks.shape)
   1715 ignore_mask[masks == 0] = True

KeyError: 'th_array'

Marcel-Salier commented 2 years ago

I was able to continue, but I got an error about the DNA peaks at the end:


KeyError                                  Traceback (most recent call last)
Input In [41], in <cell line: 1>()
----> 1 ngs.export_csv(filename = "outputD3GFAPLbPh.csv")

File ~/Documents/GitHub/ng_tools/ngtools/segmentation.py:2063, in NuclearGame_Segmentation.export_csv(self, filename)
   2060 dct_df = {}
   2062 for ft in lst_fts:
-> 2063     dct_df[ft] = [l for file in self.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
   2065 df_out = pd.DataFrame.from_dict(data = dct_df)
   2066 df_out.to_csv(self.path_save + filename, index = False)

File ~/Documents/GitHub/ng_tools/ngtools/segmentation.py:2063, in <listcomp>(.0)
   2060 dct_df = {}
   2062 for ft in lst_fts:
-> 2063     dct_df[ft] = [l for file in self.data["files"] for l in self.data["files"][file]["nuclear_features"][ft]]
   2065 df_out = pd.DataFrame.from_dict(data = dct_df)
   2066 df_out.to_csv(self.path_save + filename, index = False)

KeyError: 'dna_peaks'
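
A KeyError like this means at least one file's "nuclear_features" dict has no entry for the requested feature. A small sketch for listing the affected files, assuming the same ngs.data["files"] layout as in the traceback; `files_missing_feature` and the demo data are hypothetical.

```python
def files_missing_feature(files: dict, feature: str) -> list:
    """Return the names of files whose "nuclear_features" dict lacks `feature`.

    Hypothetical helper; `files` is assumed to look like ngs.data["files"].
    """
    return [
        name
        for name, entry in files.items()
        if feature not in entry.get("nuclear_features", {})
    ]


# Made-up data: the second file never received a "dna_peaks" entry.
demo_files = {
    "img_01": {"nuclear_features": {"area": [1, 2], "dna_peaks": [3, 5]}},
    "img_02": {"nuclear_features": {"area": [4, 6]}},
}
print(files_missing_feature(demo_files, "dna_peaks"))
```

Running such a check before export shows whether the feature is missing for every file or only for some of them.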

fursham-h commented 2 years ago

I got an error at the calculation of peaks and dots:

Okay, this was my bad. I introduced an indentation bug. This should fix it.

Marcel-Salier commented 2 years ago

I got this when importing at the beginning:

Traceback (most recent call last):

File /opt/anaconda3/envs/ngtools/lib/python3.10/site-packages/IPython/core/interactiveshell.py:3397 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

Input In [1] in <cell line: 1>
    import ngtools.segmentation as ngt

File ~/Documents/GitHub/ng_tools/ngtools/segmentation.py:1312
    self.data['files'][file]["masks"] = removenuclei(self.data['files'][file]["masks"])
    ^
TabError: inconsistent use of tabs and spaces in indentation

fursham-h commented 2 years ago

Well, so much for editing code on a phone during a day off. I fixed the indentation error again and have run the pipeline on "D3 BLEBB GFAPLbPh" without any errors. Fingers crossed.
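
A TabError of this kind can be caught before pushing by compiling the source without importing it. The sketch below relies on the built-in compile(); `first_tab_error_line` is a hypothetical helper, and the sample sources are made up.

```python
from typing import Optional


def first_tab_error_line(source: str, filename: str = "<string>") -> Optional[int]:
    """Return the line number where mixed tabs and spaces break compilation, or None.

    Hypothetical helper: it only compiles the source and never executes it.
    """
    try:
        compile(source, filename, "exec")
    except TabError as err:
        return err.lineno
    return None


# Line 2 is indented with a tab, line 3 with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"
# Same block indented consistently with spaces.
clean = "if True:\n    x = 1\n    y = 2\n"

print(first_tab_error_line(mixed))
print(first_tab_error_line(clean))
```

For a file on disk, the source can be read with `open(path).read()` and passed in; the standard library's `python -m tabnanny <file>` performs a related check from the command line.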

Marcel-Salier commented 2 years ago

Great job Fursham!!! It works smoothly for both folders! Tomorrow I will put them in the analyzer! :)

fursham-h commented 2 years ago

Great. I noticed that you changed the output filename; analyzor may raise an error during import.

ngs.export_csv(filename ="outputD3GFAPLbPh.csv")

I will fix this soon, and close this issue. Open a new issue for other errors.