MagpiePKU / EpiTrace

Cell age determination by scATAC-seq and bulk-ATAC-seq
https://epitrace.readthedocs.io
GNU General Public License v3.0

Error: ref clock list is not standard. #9

Closed by QiongZhao 2 weeks ago

QiongZhao commented 2 weeks ago

I used my own scATAC-seq data from mouse PBMCs to calculate cell age; the counts are based on mm10. When I run it, it says:

"please make double sure your ref genome, peak set and cells are similar. Preparing obj... ref clock list is not standard. Please make sure the input data, peak set and clock set are in similar reference genome . Input peakset is set to be hg19"

I have checked the reference genome version of the 10X cellranger-atac pipeline and the annotation version in Signac, and there is no problem.

So I used the data and sample code from the official tutorial chapter "Mouse T cell aging in chronic viral infection". When the following example code is run, it still displays:

"ref clock list is not standard. Please make sure the input data, peakset and clock set are in similar reference genome.Input peakset is set to be hg19."

Is there a problem? Thank you! The sample code is as follows:

    library(readr)     # provides read_tsv()
    library(EpiTrace)  # provides EpiTraceAge_Convergence()

    mouse_clock_by_MM285 <- readRDS('mouse_clock_mm285_design_clock347_mm10.rds')
    peaks_df <- read_tsv('GSE164978_scATAC_peaks.bed.gz', col_names = c('chr', 'start', 'end'))
    cells <- read_tsv('GSE164978_scATAC_barcodes.tsv.gz', col_names = c('cell'))
    mm <- Matrix::readMM(gzfile('GSE164978_scATAC_matrix.mtx.gz'))

    # Name each peak as chr-start-end
    peaks_df$peaks <- paste0(peaks_df$chr, '-', peaks_df$start, '-', peaks_df$end)

    EpiTrace::Init_Peakset(peaks_df) -> init_gr
    EpiTrace::Init_Matrix(peakname = peaks_df$peaks, cellname = cells$cell, matrix = mm) -> init_mm

    EpiTraceAge_Convergence(peakSet = init_gr, matrix = init_mm, ref_genome = 'mm10',
                            clock_gr = mouse_clock_by_MM285, iterative_time = 5,
                            min.cutoff = 0, non_standard_clock = TRUE, qualnum = 10,
                            ncore_lim = 48, mean_error_limit = 0.1) -> epitrace_obj_age_conv_estimated_by_mouse_clock

MagpiePKU commented 2 weeks ago

Hi Qiong,

Thanks for the note. The message simply states that no genome conversion is performed. The output should be correct: if your input PeakMatrix and the reference clock-like loci are from the same reference genome, no conversion is needed in this setting, and the results are unaffected.
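A quick way to sanity-check that condition is to see how many clock loci fall inside at least one peak. The sketch below is only an illustration in base R on made-up coordinates: `frac_clock_covered` is a hypothetical helper, not an EpiTrace function, and the toy data frames stand in for a real peak set and clock set. A very low fraction would hint at mismatched genome builds or chromosome naming, although a high fraction does not prove that the builds match.

```r
# Toy data with made-up coordinates; not taken from EpiTrace or GSE164978.
peaks <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 500, 1000),
  end   = c(200, 600, 1100)
)
clock <- data.frame(
  chr   = c("chr1", "chr2", "chr3"),
  start = c(150, 1050, 10),
  end   = c(151, 1051, 11)
)

# Hypothetical helper: fraction of clock loci overlapped by at least one peak.
frac_clock_covered <- function(clock, peaks) {
  hit <- vapply(seq_len(nrow(clock)), function(i) {
    any(peaks$chr == clock$chr[i] &
        peaks$start <= clock$end[i] &
        peaks$end >= clock$start[i])
  }, logical(1))
  mean(hit)
}

# Fraction of clock chromosomes that also appear in the peak set
# (catches "chr1" vs "1" style naming mismatches).
shared_chr <- mean(unique(clock$chr) %in% unique(peaks$chr))

frac <- frac_clock_covered(clock, peaks)
cat("shared chromosome names:", shared_chr, "\n")
cat("clock loci covered by a peak:", frac, "\n")
```

For real data, the same idea is usually expressed with GenomicRanges overlap functions on the GRanges objects rather than with data frames.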

To resolve the inaccurate message, we have corrected the code. Please try the dev branch with pak::pkg_install('MagpiePKU/EpiTrace@dev'). Otherwise, you can simply ignore the message.

Best, Yi

QiongZhao commented 2 weeks ago

Thanks for the reply! When I try to run pak::pkg_install('MagpiePKU/EpiTrace@dev'), I get:

Error in parse(outFile) :

 /tmp/RtmpsZpSF7/R.INSTALLccc3142a28c4/EpiTrace/R/EpiTrace.R:122:194: unexpected symbol
121:     }else{
122:       message('Reference clock list is not converted. Default standard clock reference loci is Homo sapiens hg19. If given clock_gr_list is non-human, then it is not converted and follows user
                                                                                                                                                                                                      ^
ERROR: unable to collate and parse R files for package ‘EpiTrace’

It seems that the string in the message function on line 122 is not closed properly. As long as I ensure that the reference genome version of the input peakMatrix matches that of clockDML, I can disregard the 'set the peakset to hg19' instruction when using the release version of EpiTrace, right? Thank you!

MagpiePKU commented 2 weeks ago

> Thanks for the reply! When I try to run pak::pkg_install('MagpiePKU/EpiTrace@dev'), I get: Error in parse(outFile) :
>
> /tmp/RtmpsZpSF7/R.INSTALLccc3142a28c4/EpiTrace/R/EpiTrace.R:122:194: unexpected symbol
> 121:     }else{
> 122:       message('Reference clock list is not converted. Default standard clock reference loci is Homo sapiens hg19. If given clock_gr_list is non-human, then it is not converted and follows user
>                                                                                                                                                                                                      ^
> ERROR: unable to collate and parse R files for package ‘EpiTrace’
>
> It seems that the string in the message function on line 122 is not closed properly.

Thanks for the quick check. The string is indeed broken (the ' inside it is not escaped). It has now been rewritten.
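For reference, this kind of parse failure can be reproduced in isolation. The snippet below is a minimal base R sketch: the message text is a shortened, hypothetical stand-in (the real string is truncated in the error output above), and `parse_ok` is a helper defined here for illustration only. It shows that an apostrophe inside a single-quoted string ends the string early, and that either escaping it or switching to double quotes fixes the problem.

```r
# Hypothetical helper: TRUE if the source text parses as valid R code.
parse_ok <- function(src) {
  !inherits(try(parse(text = src), silent = TRUE), "try-error")
}

# Unescaped apostrophe inside a single-quoted string: the string ends at
# "user" and the parser then hits an unexpected symbol.
broken <- "message('follows user's genome')"

# Fix 1: escape the apostrophe with a backslash.
fixed_escape <- "message('follows user\\'s genome')"

# Fix 2: delimit the string with double quotes instead.
fixed_dq <- "message(\"follows user's genome\")"

print(parse_ok(broken))        # FALSE
print(parse_ok(fixed_escape))  # TRUE
print(parse_ok(fixed_dq))      # TRUE
```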

> As long as I ensure that the reference genome version of the input peakMatrix matches that of clockDML, I can disregard the 'set the peakset to hg19' instruction when using the release version of EpiTrace, right? Thank you!

Yes. The message can simply be ignored. Apologies for the confusion. The code was originally designed only for the human genome and was later extended to non-human species.