AllenInstitute / scrattch.mapping

Generalized mapping scripts for RNA-seq and Patch-seq data

Human MTG mapping "cannot open connection" error #33

Open ru57y34nn opened 1 year ago

ru57y34nn commented 1 year ago

I am following the "Build and map against a human MTG PatchSeq taxonomy" example and trying to map human patch-seq data against AIT15.2 with the latest scrattch-mapping docker image (scrattch_mapping_latest) on HPC, but I consistently run into a connection issue when running the taxonomy_mapping() function. I load the taxonomy from "//allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/Taxonomies/AIT15.2/AI_taxonomy.h5ad" using loadTaxonomy() and then set the mode to "patchseq" with mappingMode(). When I then run taxonomy_mapping(), I get this error message:

[1] "Error caught for Tree mapping."
<simpleError in gzfile(file, "rb"): cannot open the connection>
Error in normalizePath(path, mustWork = TRUE) :
  path[1]="\\allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/Taxonomies/AIT15.2/medians.feather": No such file or directory
In addition: Warning messages:
1: replacing previous import ‘mfishtools::map_dend’ by ‘scrattch.hicat::map_dend’ when loading ‘scrattch.mapping’
2: replacing previous import ‘dendextend::pvclust_show_signif_gradient’ by ‘scrattch.hicat::pvclust_show_signif_gradient’ when loading ‘scrattch.mapping’
3: replacing previous import ‘mfishtools::resolve_cl’ by ‘scrattch.hicat::resolve_cl’ when loading ‘scrattch.mapping’
4: In cor(as.matrix(test.dat), cl.dat) : the standard deviation is zero
5: In gzfile(file, "rb") :
  cannot open compressed file '\\allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/Taxonomies/AIT15.2/patchseq/dend.RData', probable reason 'No such file or directory'

The two files referenced in the "cannot open connection" and "cannot open compressed file" messages are both present in those locations (although the leading backslashes "\" appear to be part of the issue, as they should be "//" instead). I have a feeling that this is due to something I am doing wrong or a step that I am missing. I am unable to attach the file that I am testing this with, so here is the code I am using:

```r
library(scrattch.mapping)
library(arrow)  # read_feather(), write_feather(), write_csv_arrow()

setwd("//allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/patch_seq/star/human/human_patchseq_MTG_20230713/")

human_query = '//allen/programs/celltypes/workgroups/rnaseqanalysis/SMARTer/STAR/Human/patchseq/R_Object/20230713_RSC-122-335_human_patchseq_star2.7_cpm.Rdata'
human_MD = '//allen/programs/celltypes/workgroups/rnaseqanalysis/SMARTer/STAR/Human/patchseq/R_Object/20230713_RSC-122-335_human_patchseq_star2.7_samp.dat.Rdata'

query.counts = load(human_query)
query.anno = load(human_MD)
query.anno = samp.dat[match(colnames(cpmR),samp.dat$exp_component_name),]
rownames(query.anno) = query.anno$exp_component_name
query.logCPM = logCPM(cpmR)

refFolder = "//allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/Taxonomies/AIT15.2/"

AIT.anndata = loadTaxonomy(refFolder, "AI_taxonomy.h5ad")
AIT.anndata = mappingMode(AIT.anndata, mode="patchseq")

query.mapping = taxonomy_mapping(AIT.anndata=AIT.anndata,
                                 query.data=query.logCPM,
                                 corr.map=TRUE,
                                 tree.map=TRUE,
                                 seurat.map=FALSE,
                                 label.cols=c("subclass_label","cluster_label","class_label")
)

mapFolder = c("//allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/patch_seq/star/human/human_patchseq_MTG_20230713")

mappingFolder <- paste0(mapFolder,"/mapping")
buildMappingDirectory(AIT.anndata = AIT.anndata,
                      mappingFolder = mappingFolder,
                      query.data = query.logCPM, # Don't need log-normalized data here
                      query.metadata = query.anno,
                      query.mapping = query.mapping,
                      doPatchseqQC = TRUE # Set to FALSE if not needed or if writePatchseqQCmarkers was not applied in reference generation
)

ps_anno<-read_feather(paste0(mappingFolder,"/anno.feather"))
ps_anno<-merge(ps_anno,samp.dat, by.x='sample_id',by.y='exp_component_name')
write_feather(ps_anno, paste0(mappingFolder,"/anno2.feather"))

df<-as.data.frame(query.mapping)
colnames(query.mapping)<-c('corr_score','tree_score','cor_subclass','corr_cluster','corr_class',
                           'tree_subclass','tree_cluster','tree_class')
query.mapping$exp_component_name<-row.names(df)
df<-merge(query.mapping,ps_anno,by.x ='exp_component_name', by.y='exp_component_name_label')

write_csv_arrow(df,"//allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/patch_seq/star/human/human_patchseq_MTG_20230713/mapping.df.lastmap.csv")
```

UCDNJJ commented 1 year ago

Hi Rusty,

You aren't doing anything wrong; I believe this is a bug introduced when we updated how file paths are handled. Do you mind sharing where you are running this code? On your local machine or on HPC?

@jeremymiller This issue should be related to the expanded file.path function you added: https://github.com/AllenInstitute/scrattch-mapping/blame/47b16cfa17343c8c4fcac0ae53516d4334d46218/R/utils.R#L169-L200. Do you mind taking a peek?

jeremymiller commented 1 year ago

Agreed. This function was intended to fix the "//" vs. "\" differences between Windows and UNIX systems (not to add more issues!). I'll review once @ru57y34nn shares where the code is running.

UCDNJJ commented 1 year ago

Perhaps we can handle this only in loadTaxonomy() rather than overloading every file.path call in R, which I think is a bit dangerous.
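For illustration, the idea of handling the path once at load time could look roughly like the base R sketch below. `normalize_taxonomy_dir()` is a hypothetical name, not a scrattch.mapping function, and treating every backslash as a path separator is an assumption that only suits paths like the ones in this thread.

```r
# Hypothetical helper (not part of scrattch.mapping): clean up a taxonomy
# directory once, at load time, instead of redefining file.path().
normalize_taxonomy_dir <- function(path) {
  # Assumption: backslashes only ever act as separators in these paths.
  path <- gsub("\\", "/", path, fixed = TRUE)
  # Collapse repeated slashes that follow another character;
  # a leading "//" share prefix is left alone.
  path <- gsub("([^/])/{2,}", "\\1/", path)
  # Drop trailing slashes so file.path() does not produce doubled separators.
  sub("/+$", "", path)
}

taxonomy_dir <- normalize_taxonomy_dir("\\allen/programs/AIT15.2/")
medians_file <- file.path(taxonomy_dir, "medians.feather")
```

With this approach the stock `file.path()` stays untouched, and every file under the taxonomy folder is built from the one cleaned directory string.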

jeremymiller commented 1 year ago

If we're going that route, why don't we just have a global parameter function called "set_file_prefix()" that defaults to "\", but could be "//" (or an entire file path or whatever)?
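A rough sketch of what such a global setting might look like, using R's `options()` mechanism. Everything here is hypothetical: `set_file_prefix()` exists only as a proposal in this thread, and `get_file_prefix()`, `with_file_prefix()`, and the option name `scrattch.mapping.file_prefix` are made up for illustration.

```r
# Hypothetical sketch of the proposal; none of these functions exist in
# scrattch.mapping. The default mirrors the "\" suggested above.
set_file_prefix <- function(prefix = "\\") {
  stopifnot(is.character(prefix), length(prefix) == 1)
  options(scrattch.mapping.file_prefix = prefix)
  invisible(prefix)
}

get_file_prefix <- function() {
  getOption("scrattch.mapping.file_prefix", default = "\\")
}

# Strip any leading slashes or backslashes, then prepend the configured prefix.
with_file_prefix <- function(path) {
  paste0(get_file_prefix(), sub("^[/\\\\]+", "", path))
}

set_file_prefix("//")
shared_dir <- with_file_prefix("\\allen/programs/celltypes")
```

Because the prefix lives in a session-wide option, a user on HPC would set it once at the top of a script and every path built through the helper would pick it up.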

ru57y34nn commented 1 year ago

Hi @UCDNJJ and @jeremymiller, thanks for the quick response. The file that I am running is located here: /allen/programs/celltypes/workgroups/rnaseqanalysis/rustym/human_mtg_new_pipeline_testing/human_mapping_testing.R and I am running on HPC

UCDNJJ commented 1 year ago

Hi @ru57y34nn, when you have some time can you retry your code using this docker image:

singularity shell --cleanenv docker://njjai/scrattch_mapping:0.4

Be forewarned that this update contains quite a few changes, so if you hit an error, let us know. If everything works for you, I'll work with the bicore to update the official scrattch.mapping docker.

UCDNJJ commented 12 months ago

Hi @ru57y34nn, we fully rolled back the Windows/Linux file path handling. This should resolve the error you were originally having unless the taxonomy was originally built on Windows.

Please retry with the 0.41 docker image:

singularity shell --cleanenv docker://njjai/scrattch_mapping:0.41

@jeremymiller will be able to rebuild the taxonomy if it was built on a Windows system.