LLNL / lbann

Livermore Big Artificial Neural Network Toolkit
http://software.llnl.gov/lbann/

Regarding Candle/Pilot1 dataset #933

Open Raviteja1996 opened 5 years ago

Raviteja1996 commented 5 years ago

Hi, where can I download the Candle/Pilot1 dataset? I went to "ftp.mcs.anl.gov", but the only file I could find there was P1B3_test_drugs.txt, which is mentioned in datareader_candle_pilot.prototext. I was not able to find the files named in the other data_reader.prototext files. Where can I download those Pilot1 datasets?

Raviteja1996 commented 5 years ago

Can I use the Pilot2 dataset for Pilot1 models? I am asking because I found a proper Pilot2 dataset with training and testing files, but not one for Pilot1. I want to make sure that this would work.

Raviteja1996 commented 5 years ago

I was not able to find the Pilot1 dataset online. Is there any place in particular where I should search for it?

davidHysom commented 5 years ago

A bit of googling (github candle pilot) led me to: https://github.com/ECP-CANDLE/Benchmarks/tree/master/Pilot1/P1B1


Raviteja1996 commented 5 years ago

Hi, I was not able to use them when running the Pilot1 autoencoder model with LBANN, perhaps because the model uses the P1B3 dataset and the link you gave is for P1B1. I tried running with P1B1 by downloading the train and test files and pointing to them in the data_reader file. Even when using P1B3_test_drugs.txt, I got an error reporting an invalid memory reference. Have the P1B3 train and test datasets been removed? Is there any other way to run the Pilot1 models in LBANN?

bvanessen commented 5 years ago

I will talk with the Pilot1 team to see whether the P1B3 dataset is publicly available.

Brian C. Van Essen
