digitalcytometry / ecotyper

EcoTyper is a machine learning framework for large-scale identification of cell states and cellular ecosystems from gene expression data.

Error when running ecoTyper in discovery mode on the bulk Staudt dataset #98

Open ArtemisPapadaki opened 1 week ago


Dear EcoTyper Creators, thank you for creating EcoTyper. It is an amazing tool!

We are trying to use EcoTyper for characterising B cell states and ecosystems, starting from reproducing the results in the Steen et al (2021) Cancer Cell paper.

We have been able to run the EcoTyper pipeline on the Imperial College HPC for single-cell RNA-seq data in discovery mode, but we keep running into problems making it work for bulk RNA-seq in discovery mode. We need to create the model so that we can then run the scRNA pipeline in recovery mode.

We understand that Docker is needed to run CIBERSORTx. The Imperial College HPC does not support Docker, so we tried to use Singularity instead.

The images were pulled directly from Docker Hub with Singularity, which converts them automatically, using the following commands:

    singularity pull docker://cibersortx/hires
    singularity pull docker://cibersortx/fractions

The paths to the resulting files, fractions_latest.sif and hires_latest.sif, have been provided in the configuration file. We also had to amend some Docker-related commands in the pipeline (a trial-and-error process).
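For anyone reproducing this setup, a quick sanity check on the image paths before launching the pipeline might look like the sketch below. It only tests that each file exists, is readable, and is non-empty; it does not validate the image contents. The function name `check_sif` is hypothetical and not part of EcoTyper or Singularity.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EcoTyper): check that an image file
# exists, is readable, and is non-empty before it is used in the config.
check_sif() {
  local image="$1"
  if [ ! -f "$image" ]; then
    echo "missing: $image"
    return 1
  fi
  if [ ! -r "$image" ]; then
    echo "unreadable: $image"
    return 1
  fi
  if [ ! -s "$image" ]; then
    echo "empty: $image"
    return 1
  fi
  echo "ok: $image"
}

# Example usage, run from the directory holding the pulled images:
# check_sif fractions_latest.sif && check_sif hires_latest.sif
```

If either check fails, the path given in the configuration file is the first thing to revisit.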

The scripts we used, together with the output and error files produced by the pipeline, are attached.

The following error (can be seen in file: 25760.pbs-7.ER ) stops the pipeline from running:

25760.pbs-7.ER

    Error in h(simpleError(msg, call)) :
      error in evaluating the argument 'x' in selecting a method for function 'unique': undefined columns selected
    Calls: heatmap_simple ... lapply -> lapply -> FUN -> unique -> [ -> [.data.frame
    Execution halted
    Error in RunJobQueue() : EcoTyper failed. Please check the error message above!

There are a few warnings above that error:

    Warning messages:
    1: The size argument of element_line() is deprecated as of ggplot2 3.4.0. Please use the linewidth argument instead.
    2: The size argument of element_rect() is deprecated as of ggplot2 3.4.0. Please use the linewidth argument instead.

Does this suggest that some of the commands we are using are out of date, and could that be the cause of the issue?

We believe the actual error is occurring on line 67 of the file "pipeline/state_discovery_initial_plots.R":

    p <- heatmap_simple(data, top_annotation = top_ann, top_columns = top_cols, column_title = lookup_celltype(cell_type),

It always seems to fail for "Tregs":

    Extracting cell states information for: Tregs

Even in previous runs, it failed at that point. Is something being passed into that line, at the iteration of the loop that handles "Tregs", that could be causing this?
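To check that each run really stops at the same cell type, one option is to pull the last "Extracting cell states information for:" message out of a run's log. The sketch below assumes that message appears verbatim, one per line, in whichever log file is passed in; the function name `last_cell_type` and the `$LOG` variable are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last cell type announced in a log file,
# i.e. the one being processed when the run stopped.
last_cell_type() {
  local log="$1"
  grep 'Extracting cell states information for:' "$log" \
    | tail -n 1 \
    | sed 's/.*Extracting cell states information for:[[:space:]]*//'
}

# Example usage, with LOG set to the log file of one run:
# last_cell_type "$LOG"
```

Running this over the logs of several failed runs would show whether the failing cell type is always the same one.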

Could you please advise how we could make the pipeline work? Thank you very much for your help in advance!