Sorry, I think we ought to move to S3, unless someone tells us how to download this file ;)
I second Pjotr's request. Even though I installed git-lfs and found the git lfs smudge command, it didn't help: the response is '403 Forbidden'. Another advantage of S3 over GitHub LFS servers is fairer pricing.
We are uploading to S3. Kinda surprised - even for beta I expect better from github
I have put it onto Amazon S3. https://s3.amazonaws.com/genenetwork2/db_webqtl_small.zip
Thanks Lei. It would be good to attach a README with instructions. The procedure I used is:
1) create an empty db_webqtl_s database from the mysql console
2) copy the files from the extracted db_webqtl_s dir into /var/lib/mysql/db_webqtl_s
3) set correct permissions (for me it was chown mysql:mysql and chmod 660 on /var/lib/mysql/db_webqtl_s/*)
I also wish a dataset with case attributes had been included:
> select * from CaseAttributeXRef, ProbeSetFreeze
> where CaseAttributeXRef.ProbeSetFreezeId = ProbeSetFreeze.Id;
Empty set (0.04 sec)
The README can go into the GN2 tree (root level) in INSTALL.md.
Case attributes are required.
I also have a request to have at least one example dataset for each DataScale in the test database. Currently select * from ProbeSetFreeze; returns just two rows, and for both DataScale is log2.
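The two checks above (case attributes linked to a dataset, and DataScale coverage) could be combined into one small sanity script. This is only a sketch: it uses an in-memory sqlite3 database as a stand-in so that it runs anywhere, the table and column names are the ones quoted in this thread, and the "linear" scale in the example is a placeholder, not a confirmed DataScale value.

```python
import sqlite3


def dataset_coverage(conn, expected_scales):
    """Report DataScale coverage and case-attribute links for a test database."""
    cur = conn.cursor()
    # Count datasets per DataScale value.
    cur.execute("SELECT DataScale, COUNT(*) FROM ProbeSetFreeze GROUP BY DataScale")
    per_scale = dict(cur.fetchall())
    # Count case-attribute rows that link to a dataset (same join as in the thread).
    cur.execute(
        "SELECT COUNT(*) FROM CaseAttributeXRef, ProbeSetFreeze "
        "WHERE CaseAttributeXRef.ProbeSetFreezeId = ProbeSetFreeze.Id"
    )
    linked = cur.fetchone()[0]
    missing = sorted(set(expected_scales) - set(per_scale))
    return {
        "per_scale": per_scale,
        "missing_scales": missing,
        "case_attribute_rows": linked,
    }


def demo_connection():
    """In-memory stand-in that mimics the state described in the thread."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ProbeSetFreeze (Id INTEGER PRIMARY KEY, DataScale TEXT);
        CREATE TABLE CaseAttributeXRef (ProbeSetFreezeId INTEGER);
        INSERT INTO ProbeSetFreeze (Id, DataScale) VALUES (1, 'log2'), (2, 'log2');
        """
    )
    return conn


# "linear" is a placeholder scale used only for illustration.
report = dataset_coverage(demo_connection(), ["log2", "linear"])
print(report)
```

Against the real db_webqtl_s database the same function should work with any DB-API connection (for example one opened by a MySQL driver), but that has not been tried here.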
Fixed a bug in the small database. https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip
Got it working now, and can search for traits in the dataset: Hippocampus Consortium M430v2 (Jun06)
However I do get an error when I try to run any of the different mapping tools:
Marker regression line 78
self.markers = dataset.group.get_markers()
Error: no JSON object could be decoded
Is this due to marker data being missing ?
Additionally I get errors on:
Can we add those 2 missing tables to the zip file ?
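On the "no JSON object could be decoded" error above: that is the message Python 2's json module gives when the input is empty or not JSON at all, so missing marker data is a plausible cause. How get_markers() actually reads its data is not shown in this thread, so the following is only a hypothetical sketch of a loader that separates the two cases; parse_markers and its arguments are made up for illustration.

```python
import json


def parse_markers(text, source="markers file"):
    """Parse marker data from JSON text, with a clearer error for empty input."""
    if not text or not text.strip():
        # Empty input is what you would see if the marker data was never written.
        raise ValueError("%s is empty; marker data may be missing" % source)
    try:
        return json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        raise ValueError("%s is not valid JSON: %s" % (source, exc))
```

With a check like this in place, an empty markers source would be reported as such instead of surfacing as a decode error inside the mapping tools.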
Added tables: db_webqtl_s.Docs db_webqtl_s.News
Download https://s3.amazonaws.com/genenetwork2/db_webqtl_s.zip, and then:
1) unzip it
2) chown -R mysql:mysql db_webqtl_s/
3) chmod 700 db_webqtl_s/
4) chmod 660 db_webqtl_s/*
5) restart the MySQL service
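For anyone who wants to script these steps, here is a rough Python equivalent of the unzip, chown and chmod part. It is a sketch under a few assumptions: the zip unpacks into a top-level db_webqtl_s/ directory, the directory holds only plain files, and the function name install_db_files and its parameters are made up for illustration. Downloading the zip and restarting MySQL are left out.

```python
import os
import shutil
import zipfile


def install_db_files(zip_path, datadir, db_name="db_webqtl_s", owner=None, group=None):
    """Unpack the database archive into the MySQL data directory and set permissions."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(datadir)
    db_dir = os.path.join(datadir, db_name)
    if not os.path.isdir(db_dir):
        # Assumption: the archive contains a top-level <db_name>/ directory.
        raise RuntimeError("expected %s after unzipping" % db_dir)
    os.chmod(db_dir, 0o700)  # chmod 700 db_webqtl_s/
    for name in os.listdir(db_dir):
        path = os.path.join(db_dir, name)
        os.chmod(path, 0o660)  # chmod 660 db_webqtl_s/*
        if owner or group:
            shutil.chown(path, user=owner, group=group)
    if owner or group:
        # Together with the loop above this mirrors chown -R mysql:mysql db_webqtl_s/
        shutil.chown(db_dir, user=owner, group=group)
    return db_dir
```

Run as root, a call such as install_db_files("db_webqtl_s.zip", "/var/lib/mysql", owner="mysql", group="mysql") should match the manual steps; the MySQL service still needs a restart afterwards.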
Thanks, seems to work...
Could we add the WGCNA example dataset to the genenetwork database (and the small subset)?
Then I can use that as a test dataset for WGCNA integration in GN2. Additionally, this might be nice for future workshops, since people can then see how to use WGCNA in GN2 compared to using it in R.
The example dataset is at: http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/FemaleLiver-Data.zip
We do however need to reformat it into GN2 structure.
Dear Danny, Lei and team,
This should be easy. That data set (and all other data sets for this cross) are already in the full GN1 database. In fact, I made corrections to this database recently (errors in sex assignment). GN1 has Phenotypes, genotypes, and four gene expression data sets (including the liver data set). The liver data set is presented as Male, Female, and Combined.
Here is a piece of the CSV file with the case IDs used in the Horvath example:
Mice Number Mouse_ID Strain sex DOB parents Western_Diet Sac_Date weight_g length_cm ab_fat other_fat total_fat comments 100xfat_weight Trigly Total_Chol HDL_Chol UC FFA Glucose LDL_plus_VLDL MCP_1_phys Insulin_ug_l Glucose_Insulin Leptin_pg_ml Adiponectin Aortic lesions Note Aneurysm Aortic_cal_M Aortic_cal_L CoronaryArtery_Cal Myocardial_cal BMD_all_limbs BMD_femurs_only
1 F2_290 290 306-4 BxH ApoE-/-, F2 2 3/22/02 229232 5/14/02 9/11/02 36.9 9.9 2.53 2.26 4.79 NA 12.98102981 53 1167 50 484 121 437 1117 175.85 924 0.472943723 245462 11.274 496250 NA 16 0 17 0 0 NA NA
2 F2_291 291 307-1 BxH ApoE-/-, F2 2 3/22/02 232 5/14/02 9/11/02 48.5 10.7 2.9 2.97 5.87 NA 12.10309278 61 1230 32 592 173 572 1198 92.43 5781 0.098944819 84420.88 7.099 NA NA 16 4 0 2 4 0.0548 0.0773
3 F2_292 292 307-2 BxH ApoE-/-, F2 1 3/22/02 232 5/14/02 9/11/02 45.7 10.4 1.04 2.31 3.35 NA 7.330415755 41 1285 81 460 96 497 1204 196.398 2074 0.239633558 105889.76 5.795 218500 NA 0 0 11 0 0 0.0554 0.08065
4 F2_293 293 307-3 BxH ApoE-/-, F2 1 3/22/02 232 5/14/02 9/11/02 50.3 10.9 0.91 1.89 2.8 NA 5.566600398 271 1299 64 476 122 553 1235 97.466 11874 0.046572343 100398.68 5.495 61250 NA 0 0 0 0 236 0.0597 0.0868
5 F2_294 294 307-4 BxH ApoE-/-, F2 1 3/22/02 232 5/14/02 9/11/02 44.8 9.8 1.22 2.47 3.69 NA 8.236607143 114 1410 50 516 118 535 1360 95.452 9181 0.058272519 130846.3 6.868 243750 NA 12 10 0 0 0 NA NA
6 F2_295 295 308-1 BxH ApoE-/-, F2 1 3/22/02 232 5/14/02 9/11/02 39.2 10.2 3.06 2.49 5.55 NA 14.15816327 72 1533 18 620 106 382 1515 144.27 485 0.787628866 75166.22 17.328 104250 NA 17 2 0 0 0 0.0557 0.077
Rob
I have moved the test database to GNU Guix. A direct download is possible through http://files.genenetwork.org/raw_database/
@lyan6 can you document the steps you did to create this smaller database? Thanks!
Thanks!
@lyan6 can we deploy the small database on Lily?
I deployed a small GN database on Lily, and the db name is “db_webqtl_s”.
Thanks!
Done.
https://github.com/genenetwork/gndatabase/blob/master/db_webqtl_small.zip
compressed: 512MB uncompressed: 1.3GB