Lan-lab / SIGNET


Batch correction and matrix combining #6

Closed VUbels closed 1 year ago

VUbels commented 1 year ago

Hi, first and foremost, many thanks for developing this amazing tool. I'm very keen on using it on various datasets from our lab. My question is rather basic, but I had some trouble finding the solution myself as I am still quite new to Python. I currently have 3 scRNA datasets of the same tissue, each containing ~6000 cells. In previous Seurat workflows I performed quality control on these separately, normalized the data, then found integration anchors and integrated the data into one large Seurat object. What would be the correct way of combining the three datasets I currently have in order to ensure the highest accuracy from the SIGNET analysis?
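For reference, the plain "combine into one matrix" step can be sketched with pandas as below. This is only a minimal illustration under stated assumptions: each dataset is a genes x cells table, the sample labels (`s1`, `s2`) and the helper `combine_matrices` are hypothetical, and the sketch performs no batch correction or integration. Whether SIGNET expects raw counts or integrated values is not something this sketch answers.

```python
import pandas as pd


def combine_matrices(matrices):
    """Combine genes x cells DataFrames side by side (hypothetical helper).

    matrices: dict mapping a sample label to a genes x cells DataFrame.
    """
    labelled = []
    for label, mat in matrices.items():
        # Prefix each barcode with its sample label so that identical
        # barcodes from different datasets remain distinct columns.
        labelled.append(mat.add_prefix(f"{label}_"))
    # join="inner" keeps only the genes that are present in every dataset.
    return pd.concat(labelled, axis=1, join="inner")


# Toy data for illustration only (not real measurements).
a = pd.DataFrame([[1, 0], [2, 3]], index=["GeneA", "GeneB"], columns=["AAAC", "AAAG"])
b = pd.DataFrame([[0, 5], [4, 1]], index=["GeneB", "GeneC"], columns=["AAAC", "TTTG"])

combined = combine_matrices({"s1": a, "s2": b})
print(combined)
```

The inner join is a deliberate choice here: genes missing from any one dataset are dropped rather than filled with zeros, which avoids inventing expression values.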

Additionally, when I try to run the SIGNET.py file on a scRNA matrix I keep running into a problem with the binarization of the dataset.

```
TypeError                                 Traceback (most recent call last)
Cell In[21], line 122
    120 for i in range(raw_data_ntf.shape[0]):
    121     gene_expr = raw_data_ntf.loc[gene_ntf[i], ]
--> 122     marker = [0 for x in range(int(gene_expr.max())+1)]
    123     record = []
    124     value = []

File ~\AppData\Local\R-MINI~1\lib\site-packages\pandas\core\series.py:206, in _coerce_method.<locals>.wrapper(self)
    204 if len(self) == 1:
    205     return converter(self.iloc[0])
--> 206 raise TypeError(f"cannot convert the series to {converter}")

TypeError: cannot convert the series to <class 'int'>
```

As I mentioned, I'm fairly new to Python, so I'm unsure how to resolve this issue. The scRNA matrix I'm using looks as follows and should work. I export the file as a simple .csv file, and when opened it does fit the requirements: row names are the genes and column headers are the cell barcodes:

image

Any help would be immensely appreciated,

Kind regards
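
One plausible cause of the traceback above, offered as a hedged diagnostic sketch rather than a confirmed explanation: the error means `gene_expr.max()` returned a Series instead of a single number, which happens when `.loc` returns a DataFrame. That occurs, for example, when a gene name appears more than once in the row index. The toy matrix and the variable names below are hypothetical, and keeping the first duplicate is an assumption, not a recommendation from the SIGNET authors.

```python
import pandas as pd

# Hypothetical toy matrix: genes as rows, cell barcodes as columns.
# "GeneA" appears twice to mimic a duplicated row name in an exported .csv.
raw = pd.DataFrame(
    [[0, 2, 1], [3, 0, 4], [1, 1, 0]],
    index=["GeneA", "GeneB", "GeneA"],
    columns=["cell1", "cell2", "cell3"],
)

# With a duplicated label, .loc returns a DataFrame (two rows) rather than
# a Series, so .max() gives one value per column and int() cannot convert it.
selected = raw.loc["GeneA"]
is_frame = isinstance(selected, pd.DataFrame)

# List the gene names that occur more than once in the index.
duplicated_genes = raw.index[raw.index.duplicated()].unique().tolist()

# One possible workaround (an assumption): keep only the first occurrence.
# Summing or averaging the duplicated rows would be alternatives.
deduped = raw[~raw.index.duplicated(keep="first")]
gene_max = int(deduped.loc["GeneA"].max())

print(is_frame, duplicated_genes, gene_max)
```

If `duplicated_genes` comes back empty on the real matrix, the cause lies elsewhere, for instance in how `gene_ntf` is constructed.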