BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

GDCdownload : cannot create file reason 'No such file or directory' #578

Closed JRAnalytics closed 1 year ago

JRAnalytics commented 1 year ago

Hello, I am seeing a new bug, probably linked to the recent cleanup of the GDC portal databases (link here).

For example: GDCquery( project = c("TCGA-PAAD") , data.category = "Gene expression", data.type = "Gene Expression Quantification", file.type = "normalized_results", experimental.strategy = "RNA-Seq", legacy = T ) no longer works because of the GDC cleanup.

With the following query: GDCquery( project = c("TCGA-PAAD") , data.category = "Transcriptome Profiling", data.type = "Gene Expression Quantification", legacy = F, experimental.strategy = F, platform = F, workflow.type ="STAR - Counts", sample.type = F ) the files download in chunks, but a warning occurs: <simpleWarning in file.create(to[okay]): cannot create file 'GDCdata/TCGA-PAAD/harmonized/Transcriptome_Profiling/Gene_Expression_Quantification/e927777e-e4e2-4dfe-af34-a0b0a6342e9a/a1111943-685b-45e8-8b97-74f91f5dd048.rna_seq.augmented_star_gene_counts.tsv', reason 'No such file or directory'>

After checking my directory, there is indeed no file named 'a1111943-685b-45e8-8b97-74f91f5dd048.rna_seq.augmented_star_gene_counts.tsv'. However, it is listed in the manifest used for downloading with the GDCdownload.aux() function. I also checked the GDC Data Portal UI, and this file exists in the TCGA-PAAD project.

I can't find where the problem is in the different functions of the package.

I tried other TCGA projects and got the same result. I may be the only one affected.

I hope this is clear enough to help you fix the issue.

Sincerely,

tiagochst commented 1 year ago

For the first query, we removed support for the legacy database, since access to the legacy archive through the GDC API is no longer available.

For the second query, what is your OS (Linux, macOS, Windows)?

JRAnalytics commented 1 year ago

Thanks for the update.

I am working on Windows.

tiagochst commented 1 year ago

Windows limits paths to 260 characters (MAX_PATH, roughly 256 usable after the drive prefix): link.

"GDCdata/TCGA-PAAD/harmonized/Transcriptome_Profiling/Gene_Expression_Quantification/e927777e-e4e2-4dfe-af34-a0b0a6342e9a/a1111943-685b-45e8-8b97-74f91f5dd048.rna_seq.augmented_star_gene_counts.tsv" is already 196 characters long. You can check the length of the complete path, including your working directory.

As for the package: since legacy was archived and only the current GDC API version is accessible, I need to check whether I can change the folder structure to shorten the path. I probably will not be able to do it soon.
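
The path-length check suggested above can be sketched in base R. This is only an illustration: `full_path_length()` is a hypothetical helper, not part of TCGAbiolinks, and the value 260 assumes the classic Windows MAX_PATH limit with long-path support disabled.

```r
# Relative path copied from the warning earlier in this thread.
rel_path <- paste0(
  "GDCdata/TCGA-PAAD/harmonized/Transcriptome_Profiling/",
  "Gene_Expression_Quantification/",
  "e927777e-e4e2-4dfe-af34-a0b0a6342e9a/",
  "a1111943-685b-45e8-8b97-74f91f5dd048.rna_seq.augmented_star_gene_counts.tsv"
)

# Assumption: the classic Windows MAX_PATH value of 260, which counts the
# terminating NUL, so a path of 260 or more characters is too long.
win_max_path <- 260L

# Hypothetical helper: length of the path once joined to a base directory.
full_path_length <- function(rel_path, base_dir = getwd()) {
  nchar(file.path(base_dir, rel_path))
}

rel_len <- nchar(rel_path)          # 196, the figure quoted in this thread
full_len <- full_path_length(rel_path)

if (full_len >= win_max_path) {
  message("Full path is ", full_len, " characters: too long for MAX_PATH.")
} else {
  message("Full path is ", full_len, " characters: within MAX_PATH.")
}
```

Run from the directory where the download is started, this reports whether the working directory plus the 196-character relative path stays under the assumed limit.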

JRAnalytics commented 1 year ago

I tried changing my working directory to one closer to the root, and it works.

It is indeed a path length limit problem. Surprising!
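
For anyone hitting the same warning, here is a rough way to compare candidate working directories before downloading. It is a sketch under assumptions: `leaves_room()` is a hypothetical helper, both example directories are made up, and 259 usable characters assumes the classic MAX_PATH of 260 including the terminating NUL.

```r
# Length of the relative GDCdata path quoted earlier in this thread.
rel_path_length <- 196L

# Assumption: MAX_PATH is 260 including the terminating NUL, so 259 usable.
usable_limit <- 259L

# Hypothetical helper: TRUE when base_dir, one separator, and the relative
# path together fit within the assumed limit.
leaves_room <- function(base_dir, rel_len = rel_path_length,
                        limit = usable_limit) {
  nchar(base_dir) + 1L + rel_len <= limit
}

# Made-up example directories, for illustration only.
short_dir <- "C:/tcga"
deep_dir <- paste0(
  "C:/Users/someone/Documents/",
  strrep("nested_project_folder/", 3),
  "analysis"
)

short_ok <- leaves_room(short_dir)
deep_ok <- leaves_room(deep_dir)
message("short_dir fits: ", short_ok, "; deep_dir fits: ", deep_ok)
```

Once a directory passes the check, calling setwd() on it before running GDCquery() and GDCdownload() reproduces the workaround described here.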

Thanks a lot!

tiagochst commented 1 year ago

You are welcome!