juliandewit / kaggle_ndsb2017

Kaggle datascience bowl 2017
MIT License

where are the data being stored?? #13

Open AYadav01 opened 7 years ago

AYadav01 commented 7 years ago

Hi Julian,

Congratulations on such great work. I just have a few questions about the directories where you stored the data. In 'settings.py', I see you are referring to the following locations: BASE_DIR_SSD, BASE_DIR, EXTRA_DATA_DIR, NDSB3_RAW_SRC_DIR, LUNA16_RAW_SRC_DIR.

I am confused about which folder contains what: where am I supposed to store the NDSB data, and where should the LUNA16 dataset go?

Thank you so much.

juliandewit commented 7 years ago

It's best to take a look at the source code.

nishat-sayyed commented 6 years ago

@juliandewit I'm still confused. Can you please explain a bit more?

bundelesneha05 commented 6 years ago

@AYadav01, I think this will help you.

BASE_DIR_SSD = "C:/werkdata/kaggle/ndsb3/" # create the folder name ndsb3 for saving the corresponding results
BASE_DIR = "D:/werkdata/kaggle/ndsb3/" # create the folder name ndsb3 for placing the input data here
EXTRA_DATA_DIR = "resources/" # place here extra data given by julian in his repository.
NDSB3_RAW_SRC_DIR = BASE_DIR + "ndsb_raw/stage12/" # place here the kaggle data which will further
LUNA16_RAW_SRC_DIR = BASE_DIR + "luna_raw/" # place here the LUNA16 database

All of the directories below are created to save the results of the preprocessing and nodule detector scripts:

NDSB3_EXTRACTED_IMAGE_DIR = BASE_DIR_SSD + "ndsb3_extracted_images/"
LUNA16_EXTRACTED_IMAGE_DIR = BASE_DIR_SSD + "luna16_extracted_images/"
NDSB3_NODULE_DETECTION_DIR = BASE_DIR_SSD + "ndsb3_nodule_predictions/"
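If it helps, here is a minimal sketch of how the layout above could be set up before running anything. It is not part of the repository: `build_layout` and `prepare_layout` are hypothetical helper names, and I am assuming the output folders may need to be created by hand (the scripts may well create them on their own). It only creates the three result folders and reports which raw-data folders are still missing.

```python
from pathlib import Path


def build_layout(base_dir, base_dir_ssd):
    """Return the input and output folders implied by the settings above.

    Hypothetical helper: the folder names mirror the settings quoted in this
    thread, but the function itself is not part of the repository.
    """
    base_dir, base_dir_ssd = Path(base_dir), Path(base_dir_ssd)
    # Folders you have to fill yourself with the downloaded datasets.
    inputs = {
        "NDSB3_RAW_SRC_DIR": base_dir / "ndsb_raw" / "stage12",
        "LUNA16_RAW_SRC_DIR": base_dir / "luna_raw",
    }
    # Folders that receive the preprocessing and detection results.
    outputs = {
        "NDSB3_EXTRACTED_IMAGE_DIR": base_dir_ssd / "ndsb3_extracted_images",
        "LUNA16_EXTRACTED_IMAGE_DIR": base_dir_ssd / "luna16_extracted_images",
        "NDSB3_NODULE_DETECTION_DIR": base_dir_ssd / "ndsb3_nodule_predictions",
    }
    return inputs, outputs


def prepare_layout(base_dir, base_dir_ssd):
    """Create the output folders and return the names of missing input folders."""
    inputs, outputs = build_layout(base_dir, base_dir_ssd)
    for path in outputs.values():
        path.mkdir(parents=True, exist_ok=True)
    return [name for name, path in inputs.items() if not path.is_dir()]
```

Calling `prepare_layout("D:/werkdata/kaggle/ndsb3/", "C:/werkdata/kaggle/ndsb3/")` with your own paths returns a list such as `["LUNA16_RAW_SRC_DIR"]` when that raw-data folder has not been created yet, so you can see at a glance where a dataset still needs to be placed.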

nishat-sayyed commented 6 years ago

Still confused. @bundelesneha05 Please can you create these directories with all the datasets and extras (including kaggle, luna, ndsb3 etc etc) in them and upload them somewhere? And please provide the link for the same.

bundelesneha05 commented 6 years ago

It would be better to take a look at the repository (all of the scripts). That should clear up your confusion.

nishat-sayyed commented 6 years ago

Thank you @bundelesneha05. I took a look through the code and, as you said above, I now understand most of it. The only problem I am facing now is with the directory NDSB3_RAW_SRC_DIR: Kaggle no longer provides the dataset. If any of you has the dataset or a reference to it, that would be more than helpful. Or, if the dataset is not available anywhere, can anyone suggest how to train the model with a similar dataset? Thanks again @bundelesneha05, you helped a lot.

laisecf commented 6 years ago

I found the kaggle data here: https://github.com/smeerson/DataScienceBowl2017

juliandewit commented 5 years ago

Kaggle dropped the data, but you can still train the model. You just cannot predict.