CharlesJB / ENCODExplorer

absolute files #14

Closed biterbilen closed 8 years ago

biterbilen commented 8 years ago

Using the queryEncode function from the ENCODExplorer 1.2.4 R package, I obtained a table that includes the following rows:

href                                                  accession
/files/ENCFF002AZN/@@download/ENCFF002AZN.fastq.gz    ENCSR471VHW
/files/ENCFF002BBE/@@download/ENCFF002BBE.fastq.gz    ENCSR859JNA
/files/ENCFF002AYH/@@download/ENCFF002AYH.fastq.gz    ENCSR114GLZ
/files/ENCFF001ZZY/@@download/ENCFF001ZZY.fastq.gz    ENCSR597UDW
/files/ENCFF002AZP/@@download/ENCFF002AZP.fastq.gz    ENCSR018LUP
/files/ENCFF002BCH/@@download/ENCFF002BCH.fastq.gz    ENCSR741STU
/files/ENCFF002AZZ/@@download/ENCFF002AZZ.fastq.gz    ENCSR367UDO
/files/ENCFF002AZO/@@download/ENCFF002AZO.fastq.gz    ENCSR179YLS

However, the files indicated in href above did not exist. Contacting the ENCODE project, I learnt that these files were replaced with new files. They sent me the following information:

Experiment     Old File       New File       href
ENCSR471VHW    ENCFF002AZN    ENCFF023EFI    /files/ENCFF023EFI/@@download/ENCFF023EFI.fastq.gz
ENCSR859JNA    ENCFF002BBE    ENCFF637NPE    /files/ENCFF637NPE/@@download/ENCFF637NPE.fastq.gz
ENCSR114GLZ    ENCFF002AYH    ENCFF140WER    /files/ENCFF140WER/@@download/ENCFF140WER.fastq.gz
ENCSR597UDW    ENCFF001ZZY    ENCFF606DIN    /files/ENCFF606DIN/@@download/ENCFF606DIN.fastq.gz
ENCSR018LUP    ENCFF002AZP    ENCFF195XOI    /files/ENCFF195XOI/@@download/ENCFF195XOI.fastq.gz
ENCSR741STU    ENCFF002BCH    ENCFF538SFE    /files/ENCFF538SFE/@@download/ENCFF538SFE.fastq.gz
ENCSR367UDO    ENCFF002AZZ    ENCFF699FIQ    /files/ENCFF699FIQ/@@download/ENCFF699FIQ.fastq.gz
ENCSR179YLS    ENCFF002AZO    ENCFF185IUS    /files/ENCFF185IUS/@@download/ENCFF185IUS.fastq.gz
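
For anyone hitting the same problem, a stale table could be patched by hand with a replacement list like the one ENCODE sent. The following is only a rough base-R sketch: the data frame and column names (`old`, `replacements`, `old_file`, `new_file`, `new_href`, `url`) are made up for illustration, only the first three rows are copied in, and the `https://www.encodeproject.org` prefix is an assumption about where the relative href paths resolve.

```r
# Stale rows from the queryEncode result (first three only, for brevity).
old <- data.frame(
  accession = c("ENCSR471VHW", "ENCSR859JNA", "ENCSR114GLZ"),
  href = c("/files/ENCFF002AZN/@@download/ENCFF002AZN.fastq.gz",
           "/files/ENCFF002BBE/@@download/ENCFF002BBE.fastq.gz",
           "/files/ENCFF002AYH/@@download/ENCFF002AYH.fastq.gz"),
  stringsAsFactors = FALSE
)

# Replacement list supplied by ENCODE (hypothetical column names).
replacements <- data.frame(
  accession = c("ENCSR471VHW", "ENCSR859JNA", "ENCSR114GLZ"),
  old_file  = c("ENCFF002AZN", "ENCFF002BBE", "ENCFF002AYH"),
  new_file  = c("ENCFF023EFI", "ENCFF637NPE", "ENCFF140WER"),
  stringsAsFactors = FALSE
)

# Join the two tables on the experiment accession.
fixed <- merge(old, replacements, by = "accession")

# Swap the old file accession for the new one inside each href.
fixed$new_href <- mapply(
  function(href, old_id, new_id) gsub(old_id, new_id, href, fixed = TRUE),
  fixed$href, fixed$old_file, fixed$new_file,
  USE.NAMES = FALSE
)

# Assumption: the relative hrefs resolve against the ENCODE portal host.
fixed$url <- paste0("https://www.encodeproject.org", fixed$new_href)

print(fixed[, c("accession", "new_href")])
```

This only patches the rows for which a replacement is known; it does not detect other files that may have been superseded.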

Isn't ENCODExplorer using the most recent database from ENCODE?

Thank you,

CharlesJB commented 8 years ago

Hello,

The most recent version of ENCODExplorer is 1.4.2. To install this version, you will need R-3.3.0 and Bioconductor 3.3 (https://www.bioconductor.org/install/).
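
To check whether an installed copy is behind that version, something along these lines should work. It is a minimal sketch using base R's `utils::compareVersion`; the helper name `is_outdated` is hypothetical, and the installed version is hard-coded here so the snippet runs without the package (in practice it could come from `as.character(utils::packageVersion("ENCODExplorer"))`).

```r
# Hypothetical helper: TRUE when `installed` is older than `latest`.
is_outdated <- function(installed, latest) {
  utils::compareVersion(installed, latest) < 0
}

installed <- "1.2.4"  # version reported in this issue
latest    <- "1.4.2"  # version mentioned above

if (is_outdated(installed, latest)) {
  message("ENCODExplorer ", installed, " is older than ", latest, "; consider upgrading.")
}
```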

CharlesJB commented 8 years ago

I've checked, and the last ENCODExplorer database update was on April 4th, just before the latest Bioconductor release.

I'm reluctant to push a new version of the database to the release branch because the Bioconductor documentation states: "Only bug fixes should be back-ported to the release branch. This is so that users of the release branch have a stable environment in which to get their work done."

I'll check with the bioc-devel mailing list to see whether they are OK with a new version of the database in the release branch.

Otherwise, we provide an approach to download a local version of the database in the Data update vignette.

More precisely, this part of the code:

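library(ENCODExplorer)

# Build a fresh local SQLite copy of the ENCODE metadata.
database_filename <- "new.encode.sqlite"
tables <- prepare_ENCODEdb(database_filename)
# Export the data frames from the new database.
new_encode_df <- export_ENCODEdb_matrix(database_filename)
new_accession_df <- export_ENCODEdb_accession(new_encode_df, database_filename)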