yezhengSTAT / ADTnorm

ADTnorm normalizes cell surface protein measurements in CITE-seq data, facilitating data integration across batches and across studies.
https://yezhengstat.github.io/ADTnorm/articles/ADTnorm-tutorial.html
GNU General Public License v3.0

Error in (function (unregfd, ximarks, x0marks, x0lim = NULL, WfdPar = NULL, #12

Open HsiaoChiLiao opened 1 year ago

HsiaoChiLiao commented 1 year ago

Hi Ye,

I was trying to run your ADTnorm with a dataset of 15820 cells and 207 features (198 proteins + 9 isotype controls).

The function for normalisation:

  adt206_adt_adtnorm <- ADTnorm(
    cell_x_adt = adt206_raw_adt_ctl,  #Matrix of ADT raw counts in cells (rows) by ADT markers (columns) format.
    cell_x_feature = cell_x_feature, #Matrix of cells (rows) by cell features (columns) such as sample, batch, or other cell-type related information.
    save_outpath = outpath, 
    study_name = "adt198", 
    marker_to_process = NULL, 
    save_fig = TRUE
  )

However, I ran into the below error:

Progress:  Each dot is a curve
..
Progress:  Each dot is a curve
..
Progress:  Each dot is a curve
..
Error in (function (unregfd, ximarks, x0marks, x0lim = NULL, WfdPar = NULL,  : 
  Argument ximarks has values outside of range of unregfd.
In addition: Warning messages:
1: Removed 1 rows containing missing values (`geom_segment()`). 
2: Removed 1 rows containing missing values (`geom_segment()`). 

Below I'm showing the input data (hope it will help).

[image]

[image]

Thank you and I'm looking forward to hearing from you.

Warm regards, Hsiao-Chi

yezhengSTAT commented 1 year ago

Hello Hsiao-Chi,

Thanks for using ADTnorm. Can you confirm that you are using the latest version? We have uploaded a few changes in recent months. If you are, can you narrow it down to the marker that led to the error? Is it an IgG marker? Does it contain a lot of NAs? Also, please check the values and the density plot to see whether the marker contains only a negative peak or has only a small number of unique, low values.

I remember seeing this error before but haven't come across it for a while with the data on my side, so I suspect that certain data characteristics trigger it.

Thanks, Ye
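
One way to narrow down the marker, as suggested above, is to run each marker on its own and record which ones raise an error. The following is only a minimal sketch and not part of ADTnorm: `find_failing_markers`, `run_one` and `toy_run` are hypothetical names, and it assumes that `marker_to_process` accepts a single marker name.

```r
# Hypothetical helper (not part of ADTnorm): call `run_one` once per marker
# and collect the error message for every marker that fails.
find_failing_markers <- function(markers, run_one) {
  failures <- character(0)
  for (m in markers) {
    msg <- tryCatch({
      run_one(m)
      NA_character_  # no error for this marker
    }, error = function(e) conditionMessage(e))
    if (!is.na(msg)) failures[m] <- msg
  }
  failures  # named character vector: marker name -> error message
}

# Toy stand-in for the real call, so the sketch runs without ADTnorm installed.
toy_run <- function(m) {
  if (m %in% c("CD2", "CD28")) {
    stop("Argument ximarks has values outside of range of unregfd.")
  }
  invisible(TRUE)
}

failed <- find_failing_markers(c("CD2", "CD45RA", "CD28"), toy_run)
print(names(failed))

# With real data, `run_one` could wrap the call from the original report
# (assumption: one marker name per call is accepted), e.g.
# run_one <- function(m) ADTnorm(
#   cell_x_adt = adt206_raw_adt_ctl, cell_x_feature = cell_x_feature,
#   save_outpath = outpath, study_name = "adt198",
#   marker_to_process = m, save_fig = TRUE
# )
```

Catching the error per marker keeps the loop going, so a single run lists every problematic marker instead of stopping at the first one.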

ansonrel commented 5 months ago

Hi,

I have a similar issue to the one mentioned above (with ADTnorm version 1.0). I checked whether I had any missing values or a lot of zero values but couldn't find anything abnormal. Disclaimer: a possible reason is that I'm using a subset (1000 cells) of the entire dataset, as I'm testing a complex pipeline. Still, I'm puzzled about what causes the error and how to avoid it automatically when running my entire pipeline.

Here, the first 3 markers lead to the "Argument ximarks has values outside of range of unregfd." error; the other markers run fine:

> apply(cell_x_adt, 2, function(x) summary(x))
              CD2   CD45RA      CD28  AnnexinV      BTLA       CD117      CD123       CD13     CD133
Min.      0.00000   1.0000   0.00000   0.00000  0.000000   0.0000000   0.000000   0.000000   0.00000
1st Qu.  19.00000  21.0000   9.00000  11.00000  1.000000   0.0000000   2.000000   2.000000   1.00000
Median   87.00000  71.0000  30.00000  17.00000  2.000000   0.0000000   3.000000   3.000000   1.00000
Mean     96.26253 114.2895  45.39019  21.54758  2.685641   0.7667799   9.667799   8.988955   2.32158
3rd Qu. 140.00000 184.0000  71.00000  25.00000  3.000000   1.0000000   6.000000   5.000000   3.00000
Max.    513.00000 646.0000 274.00000 245.00000 33.000000 170.0000000 824.000000 235.000000 240.00000
            CD134
Min.     0.000000
1st Qu.  3.000000
Median   4.000000
Mean     4.801402
3rd Qu.  6.000000
Max.    57.000000
> apply(cell_x_adt, 2, function(x) sum(is.na(x)))
     CD2   CD45RA     CD28 AnnexinV     BTLA    CD117    CD123     CD13    CD133    CD134 
       0        0        0        0        0        0        0        0        0        0 

And the resulting density plots:

[image: plot_zoom_png]

I'm not sure what you mean by 'a small number of unique and low values'. Do you know how I could check this?

Thanks for your help, Anthony
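
One possible way to check for 'a small number of unique and low values' is to count the distinct values per marker and see how much of the data sits at zero. The following is a minimal base-R sketch, assuming `cell_x_adt` is a cells-by-markers matrix of raw counts as in the output above. `marker_value_summary` and the toy matrix are made up for illustration, and what counts as "small" or "low" is a judgment call rather than a threshold defined by ADTnorm.

```r
# Hypothetical helper: per-marker count of distinct values, fraction of
# zeros, median and maximum for a cells (rows) x markers (columns) matrix.
marker_value_summary <- function(cell_x_adt) {
  data.frame(
    marker    = colnames(cell_x_adt),
    n_unique  = apply(cell_x_adt, 2, function(x) length(unique(x[!is.na(x)]))),
    frac_zero = apply(cell_x_adt, 2, function(x) mean(x == 0, na.rm = TRUE)),
    median    = apply(cell_x_adt, 2, median, na.rm = TRUE),
    max       = apply(cell_x_adt, 2, max, na.rm = TRUE),
    row.names = NULL
  )
}

# Toy data (not from the thread): one sparse, low-valued marker and one
# marker with a wider spread of counts.
toy <- cbind(
  CD117 = c(0, 0, 0, 0, 1, 0, 0, 1, 0, 170),
  CD2   = c(0, 19, 87, 96, 140, 200, 310, 45, 60, 513)
)
print(marker_value_summary(toy))
```

Markers with very few distinct values or a high fraction of zeros are the ones worth inspecting in the density plot first.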

yezhengSTAT commented 5 months ago

Hello Anthony,

If you are having problems with CD2, CD45RA and CD28 rather than CD117, CD123, CD13 and CD133, then there must indeed be something wrong with the program. For now, can you try quantile_clip = 0.99 to remove the extremely large values while I look into this issue more closely?

Thanks, Ye
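
To see which values a setting such as quantile_clip = 0.99 would target, one can compare each marker's 99th percentile with its maximum. This is a base-R sketch only: it does not reproduce how ADTnorm applies the clip internally, and `tail_summary` and the toy data are hypothetical.

```r
# Hypothetical helper: for each marker, report the chosen upper quantile,
# the maximum, and how many cells lie strictly above that quantile.
tail_summary <- function(cell_x_adt, prob = 0.99) {
  t(apply(cell_x_adt, 2, function(x) {
    q <- unname(quantile(x, prob, na.rm = TRUE))
    c(quantile = q,
      max      = max(x, na.rm = TRUE),
      n_above  = sum(x > q, na.rm = TRUE))
  }))
}

# Toy data (not from the thread): CD123 has one extreme value, CD134 does not.
set.seed(1)
toy <- cbind(
  CD123 = c(rpois(199, 3), 824),
  CD134 = rpois(200, 4)
)
print(tail_summary(toy))
```

A large gap between the quantile and the maximum points to a few extreme cells. The clip itself is requested by passing quantile_clip = 0.99 to ADTnorm(), as suggested above.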

ansonrel commented 4 months ago

Sorry for the late answer, I somehow missed the notification of your reply!

So quantile_clip = 0.99 does resolve the problem for some markers, but it still fails for others (with the same error message).

Also, I hit an error with quantile_clip = 0.99 that I think is resolved in this PR: https://github.com/yezhengSTAT/ADTnorm/pull/14#issue-2252971501