ZJU-UoE-CCW-LAB / scCDC

single-cell Contamination Detection and Correction
GNU General Public License v3.0

Cannot find the following identities in the object: g15 #4

Open changostraw opened 4 months ago

changostraw commented 4 months ago

Hello, thanks for the tool!

I get an error during calculation of the contamination ratio:

Caculating entropy... |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=28s
Caculating expression level... |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=25s
Calculating entropy-expression relation... |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=15s
Extracting contamination degree... |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
Complete detection. 7 contaminated genes found

contamination_ratio = ContaminationQuantification(seuratobject,rownames(GCGs))
Calculating contamination ratio...
| | 0 % ~calculating
Error in WhichCells.Seurat(object, idents = cluster) :
  Cannot find the following identities in the object: g15

seuratobj_corrected = ContaminationCorrection(seuratobject,rownames(GCGs))
Calculating correction threshold...
| | 0 % ~calculating
Error in WhichCells.Seurat(object, idents = cluster) :
  Cannot find the following identities in the object: g15

I am not sure what g15 identities are. My Seurat object has clusters, PCA and UMAP:

An object of class Seurat
28372 features across 11189 samples within 1 assay
Active assay: RNA (28372 features, 2000 variable features)
 3 layers present: counts, data, scale.data
 2 dimensional reductions calculated: pca, umap

Am I missing some preprocessing? Thanks!

Stephen1202-Wang commented 4 months ago

Hi @changostraw, thank you for using scCDC. We developed this version to be compatible with Seurat V4, and we've noticed that some functions might not work as expected with the newly released Seurat V5. From the log file and the details you've provided, the issue could be related to the version of Seurat you're using, or it might stem from the data itself.

Could you please send us a downsampled version of the dataset that's causing the problem? Then we can investigate the issue. You can create a downsampled dataset using the following command:

seuratobject_downsample <- subset(seuratobject, downsample = 100)

changostraw commented 4 months ago

Thanks for getting back to me! I have Seurat 4 loaded when I get this error:

Seurat "4.3.1"

SeuratData "4.3.1"

SeuratObject "4.3.1"

SeuratWrappers "4.3.1"

Attached is the downsampled seuratobject.

Thanks!

INS_PoolB_downsample.rds https://drive.google.com/file/d/1P8FJMLBnZ7Hx-uFVGgI_G6FSvCpq4rmZ/view?usp=drive_web

Ariscen commented 4 months ago

Hi @changostraw, thanks for sharing! We have debugged our pipeline using your downsampled dataset.

With Seurat 5, we successfully reproduced the error "Cannot find the following identities in the object: g15". The cause is that Seurat::AverageExpression() returns different column names in the two Seurat versions. Seurat 4 uses the cluster ids from active.ident directly as the column names of the output. Seurat 5, when the cluster ids are pure numbers, prepends the letter "g" to each cluster id. The output of Seurat::AverageExpression() therefore contains column names like "g15", and looking these up as identities fails because the cluster ids in the dataset are pure numbers like "15", not "g15".
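
To make the mismatch concrete, here is a minimal base-R sketch. It does not call Seurat or scCDC and is not how either package handles the names internally; the toy cluster ids and the helper `strip_g_prefix()` are hypothetical and only illustrate why a lookup by "g15" fails when the object stores "15", and how prefixed names could in principle be mapped back.

```r
# Toy stand-ins (assumptions): identities stored in the object, and the
# column names an averaged-expression matrix would carry under each version.
cluster_ids <- c("0", "1", "15")          # Seurat 4 style: ids used as-is
v5_colnames <- paste0("g", cluster_ids)   # Seurat 5 style: "g" prepended

# Names that a lookup by identity would fail to find in the object.
setdiff(v5_colnames, cluster_ids)

# Hypothetical helper: strip the "g" only when the remainder is purely
# numeric AND matches a known identity, so a genuine cluster name such as
# "gamma" is left untouched.
strip_g_prefix <- function(col_names, known_ids) {
  stripped <- sub("^g(?=[0-9]+$)", "", col_names, perl = TRUE)
  ifelse(stripped %in% known_ids, stripped, col_names)
}

strip_g_prefix(v5_colnames, cluster_ids)
```

The check against `known_ids` is a deliberate design choice in this sketch: it avoids renaming a column whose stripped form does not correspond to any identity actually present.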

With Seurat 4, no errors occurred. As for the messages shown when loading Seurat 4, some warning messages may appear when you load packages under a higher version of R, but that should be okay.

We are also working on updating the scCDC package to be compatible with the current Seurat 5 as soon as possible. Until that update is released, we suggest continuing to use Seurat 4 to run scCDC, in case other incompatibility issues occur.

Thanks for your messages again! Please let us know if you need any other support.

changostraw commented 4 months ago

Thanks for responding, but as I said previously, I am not using Seurat 5. I only have Seurat 4.3.1 installed, and r/4.3.

changostraw commented 4 months ago

Sorry, my mistake. I checked again and I do have Seurat 5 installed. That must have happened just this week, when I installed another package with a dependency that automatically updated Seurat. I had been keeping Seurat downgraded to 4 on purpose because I found 5 too buggy. I have downgraded Seurat to 4 again and it appears to be working now. Thanks!
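
Because a dependency can upgrade Seurat without notice, a version check before running scCDC may catch this early. The following is a minimal sketch, not part of scCDC; the helper name `is_seurat_v5()` is hypothetical, and the commented-out usage assumes Seurat is installed in the session.

```r
# Hypothetical guard: TRUE when a version string is Seurat 5.0.0 or newer.
is_seurat_v5 <- function(version) {
  package_version(version) >= package_version("5.0.0")
}

is_seurat_v5("4.3.1")
is_seurat_v5("5.0.3")

# In a live session (assumes Seurat is installed):
# if (is_seurat_v5(as.character(utils::packageVersion("Seurat")))) {
#   warning("Seurat 5 detected; this scCDC release was developed against Seurat 4")
# }
```

Checking `utils::packageVersion("Seurat")` reports the version R will actually load from the library path, which can differ from the version one remembers installing.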

Dot4diw commented 1 month ago

If you are using Seurat 5, the following code can solve the problem:

obj$gclusters <- paste0("g", obj$seurat_clusters)
Idents(obj) <- obj$gclusters
contamination_ratio = ContaminationQuantification(obj,rownames(GCGs))