pinellolab / dictys

Context specific and dynamic gene regulatory network reconstruction and analysis
GNU Affero General Public License v3.0

ValueError: Subset RNA cells not found in population #27

Closed by orbitalse 1 year ago

orbitalse commented 1 year ago

Hello Lingfei,

I am encountering the following error when trying to run my data through Dictys. I am at the makefiles step, and the error is raised when I run makefile_check.py to check that the input data is in order before starting GRN inference.

ValueError: Subset RNA cells not found in population

I am using unpaired snRNA-seq and snATAC-seq data from the same donor, but not profiled jointly in the same cells. I have already formatted the snRNA-seq expression matrix, as well as the clusters/subsets for both snRNA and snATAC, following the full-multiome tutorial.

Would you be able to elaborate on what this error means? I tried tracing the error back through the code, but to no avail.

Many thanks in advance for your help.

lingfeiwang commented 1 year ago

Hi orbitalse,

Thank you for the question.

In your input folder, each subsets/*/names_rna.txt file should list the names of the cells in that subpopulation. This error means that some cell names in these files cannot be found in your expression matrix file, expression.tsv.gz. You need to ensure that every cell listed in subsets/*/names_rna.txt has its expression profile in expression.tsv.gz.
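For anyone hitting the same error, the check described above can be reproduced outside Dictys with a short script. This is only a sketch, not Dictys code: the function names are hypothetical, and it assumes that expression.tsv.gz is a tab-separated file whose header line holds the cell names after a first field labelling the gene column, and that each names_rna.txt lists one cell name per line.

```python
import gzip
from pathlib import Path


def read_expression_cells(path):
    """Return the set of cell names found in the header line of a gzipped TSV."""
    with gzip.open(path, "rt") as f:
        header = f.readline().rstrip("\n").split("\t")
    # Assumption: the first field labels the gene column; the rest are cell names.
    return set(header[1:])


def find_missing_cells(data_dir):
    """Map each subset name to the cell names absent from the expression matrix."""
    data_dir = Path(data_dir)
    cells = read_expression_cells(data_dir / "expression.tsv.gz")
    missing = {}
    for names_file in sorted(data_dir.glob("subsets/*/names_rna.txt")):
        # Names are compared verbatim, so stray whitespace shows up as a mismatch.
        names = [ln for ln in names_file.read_text().splitlines() if ln]
        absent = [n for n in names if n not in cells]
        if absent:
            missing[names_file.parent.name] = absent
    return missing
```

Calling `find_missing_cells("data")` on the input folder (the folder name here is a placeholder) returns an empty dictionary when every listed cell is present, and otherwise shows which subsets contain unmatched names.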

We will make this clear in the error message in the next version. Please reopen the issue if the error persists.

Lingfei

orbitalse commented 1 year ago

Many thanks for your timely reply! I was able to resolve the issue. The column names of the expression matrix didn't exactly match the cell barcodes in subsets/*/names_rna.txt, which caused the error.
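When names fail to match exactly, as in the case above, it can help to see how they differ. The sketch below is one possible way to do that using Python's standard difflib module; `suggest_matches` and its `cutoff` default are illustrative choices, not part of Dictys, and the thread does not say what the actual difference was.

```python
import difflib


def suggest_matches(missing_names, matrix_columns, cutoff=0.8):
    """For each unmatched name, return the closest matrix column name, if any."""
    columns = list(matrix_columns)
    return {
        # get_close_matches returns an empty list when nothing clears the cutoff.
        name: difflib.get_close_matches(name, columns, n=1, cutoff=cutoff)
        for name in missing_names
    }
```

Fuzzy matching is slow on large matrices, so passing only a handful of unmatched names is usually enough to reveal a systematic difference such as an extra suffix or prefix.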