Closed JunzeChen closed 4 years ago
Hi @JunzeChen, unfortunately this is a problem with Windows operating systems. Windows limits the number of characters in a file path, and that limit is what causes the error.
Has there been any further activity on this? Corporate standards constrain me to using a Windows system and my working directory is c:/temp, which is the shortest I am allowed. The name of the file I am trying to download from the Broad Institute is "gdac.broadinstitute.org_LAML.Merge_methylationhumanmethylation27jhu_usc_edu__Level_3within_bioassay_data_set_functiondata.Level_3.2016012800.0.0.tar.gz" which is triggering the following error:
Content type 'application/x-gzip' length 49624216 bytes (47.3 MB)
downloaded 47.3 MB
Error in dirname(name) : path too long
The download appears to have worked based on the messages, so I am wondering whether an updated version of the RTCGAToolbox package could shorten the name of the file after the download to fit the Windows limit before processing it further. As a thought, maybe retain the original name in an object for verification purposes and/or write it out in the messages for tracking in the event that multiple files are downloaded.
Hi @dmbergau, Yes, this is a known issue for Windows users, see #19 and #22. I can have a look at it but I can't guarantee a fix. If you'd like to take a stab at it, I'll be happy to review a pull request.
Best regards, Marcel
Thank you Marcel for your quick response.
I am a physiologist, a bit of a newbie programmer, and new to GitHub, so the specifics and syntax of this are well above my programming capabilities. In terms of "R-ish" pseudocode inspired by #19 (hopefully without having to use external software like 7-zip), here is what I am thinking:
maxchar <- 200  # or whatever the Windows path character limit is
append <- 10    # however much room you want to leave at the end to append an increment
increment <- 1
if (length(fileList) == 1 && nchar(fileList) > maxchar) {
  tmp1 <- fileList  # retain the original name for verification
  shorter <- substr(fileList, 1, maxchar - append)
  # bump the increment until the shortened name is not already taken
  while (file.exists(paste0(shorter, "_", increment))) {
    increment <- increment + 1
  }
  shortenedName <- paste0(shorter, "_", increment)
  file.rename(tmp1, shortenedName)
}
There may be more to it programmatically, but this is the best I can do.
Best regards, Dennis
You might also want to try curatedTCGAData (http://www.bioconductor.org/packages/curatedTCGAData/), which further processes data from RTCGAToolbox to provide SummarizedExperiment and RaggedExperiment objects within MultiAssayExperiments, so that all assays are linked to patient data and to each other. It doesn't have the same issue on Windows.
Thank you Levi, I will give that a try and let you know.
Warm regards, Dennis
I've added a warning; see #29. There is not much that can be done (in R) since it is more of an OS issue. Best, Marcel
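For illustration only, a check along the following lines could flag over-long paths before extraction fails. This is a minimal sketch and not the code from #29: the helper name `checkPathLength` and its arguments are hypothetical, and it assumes the classic Windows MAX_PATH limit of 260 characters (which includes the terminating null). Files inside an archive can have longer paths than the archive itself, so the check is probably most useful when given the member paths, for example those returned by `untar(tarfile, list = TRUE)`.

```r
# Hypothetical helper (not part of RTCGAToolbox): warn when a full
# destination path would exceed the classic Windows MAX_PATH limit.
checkPathLength <- function(paths, destDir = getwd(), maxPath = 260L) {
  # Full paths as they would be written to disk
  fullPaths <- file.path(destDir, paths)
  # MAX_PATH counts the terminating null, so the usable length is maxPath - 1
  tooLong <- nchar(fullPaths) > (maxPath - 1L)
  if (any(tooLong)) {
    warning(sum(tooLong), " path(s) may be too long for Windows (longest: ",
            max(nchar(fullPaths)), " characters); consider a shorter 'destDir'",
            call. = FALSE)
  }
  # TRUE for paths that fit, FALSE for paths that are too long
  invisible(!tooLong)
}
```

Calling `checkPathLength(untar(tarfile, list = TRUE), destDir = "c:/temp")` would then return one logical per archive member, where `tarfile` stands for the path to a downloaded archive.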
Dear mksamur/RTCGAToolbox: @mksamur RTCGAToolbox is a powerful and useful package. However, when I use Data <- getFirehoseData(dataset="LUAD", runDate="20160128", Clinic=TRUE, RNAseq_Gene=TRUE, mRNA_Array=TRUE, Mutation=TRUE) to download data, I get the following error: Error in dirname(name) : path too long
How can I solve this problem?
Many thanks! Junze Chen
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936
[2] LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RTCGAToolbox_2.5.2
loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10      lattice_0.20-34   XML_3.98-1.6      bitops_1.0-6
 [5] grid_3.4.0        plyr_1.8.4        gtable_0.2.0      scales_0.4.1
 [9] ggplot2_2.2.1     lazyeval_0.2.0    data.table_1.10.4 RCircos_1.2.0
[13] limma_3.31.21     Matrix_1.2-8      cowplot_0.7.0     splines_3.4.0
[17] RJSONIO_1.3-0     tools_3.4.0       RCurl_1.95-4.8    munsell_0.4.3
[21] survival_2.41-2   compiler_3.4.0    colorspace_1.3-2  tibble_1.3.0
The code is: `> library(RTCGAToolbox)