This is a great package and thank you for making it available here. It really helped a novice in scanpy and anndata like myself quite a lot.
I just wanted to point out that when I set the use_raw option to True, I got the following error:
ValueError: Shape of passed values is (325, 25809), indices imply (325, 1153)
I think this happens because, when forming the DataFrame for the pseudobulk matrix, the transform function sets the columns to the variable genes from the processed X matrix of the AnnData object, of which there are only 1153 in my data, while the aggregated values come from the raw data and so have 25809 columns. I came up with the following workaround, a somewhat messy transform function that sets the columns to the genes from the raw data when use_raw is set to True.
import numpy as np
import pandas as pd
from tqdm import tqdm


def customTransform(self) -> pd.DataFrame:
    """
    Performs the aggregation based on the fit indices.
    """
    if not self._isfit:
        raise AttributeError("Please fit the object first")

    matrix = []
    for pairs in tqdm(self.groupings, desc="Aggregating Samples"):
        # wrap a bare key in a tuple so it can be looked up in grouping_masks
        if not isinstance(pairs, tuple):
            pairs = (pairs,)
        if pairs in self.grouping_masks:
            matrix.append(self._get_agg(self.grouping_masks[pairs]))

    # stack all observations into a single matrix
    matrix = np.vstack(matrix)

    # with use_raw the aggregated values cover every gene in .raw, so the
    # column labels must come from adat.raw.var rather than adat.var
    var = self.adat.raw.var if self.use_raw else self.adat.var
    self.matrix = pd.DataFrame(
        matrix,
        index=self.meta.SampleName.values,
        columns=var.index.values,
    )
    self._istransform = True
    return self.matrix
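The shape mismatch behind the error can also be reproduced outside the package. The following is only a minimal sketch using numpy and pandas: the sizes, gene names, sample names, and the build_frame helper are made up for illustration and are not part of the package or of my data.

```python
import numpy as np
import pandas as pd

# Hypothetical small sizes standing in for 325 samples,
# 25809 raw genes and 1153 variable genes.
n_samples, n_raw_genes, n_hvg = 3, 6, 2

raw_genes = [f"gene{i}" for i in range(n_raw_genes)]  # stands in for adat.raw.var.index
hvg_genes = raw_genes[:n_hvg]                         # stands in for adat.var.index
samples = [f"sample{i}" for i in range(n_samples)]

# Aggregated values computed from the raw data: one column per raw gene.
matrix = np.ones((n_samples, n_raw_genes))


def build_frame(matrix, samples, use_raw):
    """Label the columns with the gene list that matches the matrix width."""
    columns = raw_genes if use_raw else hvg_genes
    return pd.DataFrame(matrix, index=samples, columns=columns)


# Labelling the raw-sized matrix with the variable genes fails,
# because 6 columns of values cannot take 2 column labels.
try:
    pd.DataFrame(matrix, index=samples, columns=hvg_genes)
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)

# Labelling it with the raw genes works.
fixed = build_frame(matrix, samples, use_raw=True)

print(mismatch_error)
print(fixed.shape)
```

The first print shows the same kind of ValueError as above, and the second shows that the frame keeps one column per raw gene once the labels are taken from the raw gene list.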
It's a simple fix but I thought it might be nice to have a reference for it here for future users.
Best, Alper