YeoLab / flotilla

Reproducible machine learning analysis of gene expression and alternative splicing data
http://yeolab.github.io/flotilla/docs
BSD 3-Clause "New" or "Revised" License

Unable to download hg19 annotations #313

Open hjeanc opened 9 years ago

hjeanc commented 9 years ago
2015-07-06 12:15:30 Initializing Study
2015-07-06 12:15:30 Initializing Predictor configuration manager for Study
2015-07-06 12:15:30 Predictor ExtraTreesClassifier is of type <class 'sklearn.ensemble.forest.ExtraTreesClassifier'>
2015-07-06 12:15:30 Added ExtraTreesClassifier to default predictors
2015-07-06 12:15:30 Predictor ExtraTreesRegressor is of type <class 'sklearn.ensemble.forest.ExtraTreesRegressor'>
2015-07-06 12:15:30 Added ExtraTreesRegressor to default predictors
2015-07-06 12:15:30 Predictor GradientBoostingClassifier is of type <class 'sklearn.ensemble.gradient_boosting.GradientBoostingClassifier'>
2015-07-06 12:15:30 Added GradientBoostingClassifier to default predictors
2015-07-06 12:15:30 Predictor GradientBoostingRegressor is of type <class 'sklearn.ensemble.gradient_boosting.GradientBoostingRegressor'>
2015-07-06 12:15:30 Added GradientBoostingRegressor to default predictors
2015-07-06 12:15:30 Loading metadata
2015-07-06 12:15:30 Loading species metadata from ~/flotilla_packages
Creating a directory for saving your flotilla projects: /home/hjclemons/flotilla_projects
Creating a directory for saving the data for this project: /home/hjclemons/flotilla_projects/hg19
https://s3-us-west-2.amazonaws.com/flotilla-projects/hg19/datapackage.json has not been downloaded before.
    Downloading now to /home/hjclemons/flotilla_projects/hg19/datapackage.json
https://s3-us-west-2.amazonaws.com/flotilla-projects/hg19/gencode.v19.annotation.gene.attributes.plus.json has not been downloaded before.
    Downloading now to /home/hjclemons/flotilla_projects/hg19/gencode.v19.annotation.gene.attributes.plus.json
https://s3-us-west-2.amazonaws.com/flotilla-projects/hg19/miso_metadata_gencode_v19_plus.json has not been downloaded before.
    Downloading now to /home/hjclemons/flotilla_projects/hg19/miso_metadata_gencode_v19_plus.json
https://s3-us-west-2.amazonaws.com/flotilla-projects/ercc/ERCC_Controls.txt has not been downloaded before.
    Downloading now to /home/hjclemons/flotilla_projects/hg19/ERCC_Controls.txt
No phenotype to color mapping was provided, falling back on reasonable defaults.
No phenotype to marker (matplotlib plotting symbol) was provided, so each phenotype will be plotted as a circle in visualizations.
Error loading species hg19 data: HTTP Error 404: Not Found
2015-07-06 12:15:50 Loading expression data
2015-07-06 12:15:50 Initializing expression
2015-07-06 12:15:50 Done initializing expression
2015-07-06 12:15:52 Successfully initialized a Study object!

olgabot commented 9 years ago

Everything here looks correct - what seems to be the problem?

hjeanc commented 9 years ago

When I ran GO, the "features_of_interest_in_go_term_gene_symbols" output column did not contain gene symbols; it just repeated the Ensembl IDs. Shashank thought the following error was the reason this happened:

Error loading species hg19 data: HTTP Error 404: Not Found

olgabot commented 9 years ago

Can you run study.save("studyname"), then flotilla.embark("studyname"), and see whether you get the same errors?
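
The log does not say which of the four downloads returned the 404, so it may help to check each URL on its own. Below is a minimal sketch that does this with Python 3's standard library, independent of flotilla. The helper names `http_status` and `find_missing` are made up for illustration and are not part of flotilla, and the session in the log may well have been Python 2, where the urllib module layout differs.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# URLs copied verbatim from the log output earlier in this thread.
URLS = [
    "https://s3-us-west-2.amazonaws.com/flotilla-projects/hg19/datapackage.json",
    "https://s3-us-west-2.amazonaws.com/flotilla-projects/hg19/gencode.v19.annotation.gene.attributes.plus.json",
    "https://s3-us-west-2.amazonaws.com/flotilla-projects/hg19/miso_metadata_gencode_v19_plus.json",
    "https://s3-us-west-2.amazonaws.com/flotilla-projects/ercc/ERCC_Controls.txt",
]


def http_status(url, opener=urlopen, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable.

    `opener` is injectable (hypothetical convenience) so the function can be
    exercised without network access.
    """
    request = Request(url, method="HEAD")  # HEAD avoids downloading the body
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code
    except URLError:
        # DNS failure, refused connection, etc.
        return None


def find_missing(urls, opener=urlopen):
    """Return the subset of urls that respond with 404 Not Found."""
    return [url for url in urls if http_status(url, opener=opener) == 404]


# Example (needs network access):
#     print(find_missing(URLS))
```

If one of these URLs comes back as 404, that file is the likely source of the "Error loading species hg19 data" message. If none does, the failing request is probably for a resource listed inside datapackage.json rather than one of the URLs printed in the log.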