mlpack / models

models built with mlpack
https://models.mlpack.org/docs
BSD 3-Clause "New" or "Revised" License

run models #14

Closed sahooora closed 3 years ago

sahooora commented 4 years ago

How can we run the models? I want to run the VAE model, but it's not specified how!

kartikdutt18 commented 4 years ago

Hi @sara-hoseininasab, this repo is undergoing restructuring right now, so that portion was edited in the readme. Sorry about that. We do have a completely working repository here. It has a makefile and scripts to download datasets and run models as well. Here is the VAE example.

sahooora commented 4 years ago

Thanks! After running mnist_vae_cnn I got this error: [FATAL] Cannot open file './../data/mnist_full.csv'. I should mention that before running it I executed download_dataset.py in the tools directory.

kartikdutt18 commented 4 years ago

Can you check that there is a data folder containing mnist_full.csv outside the vae directory? Regards.

sahooora commented 4 years ago

There is no mnist_full.csv in the data directory. It contains only mnist_test.csv and mnist_train.csv.

kartikdutt18 commented 4 years ago

Hmm, Let me try replicating the issue.

sahooora commented 4 years ago

Thanks!

shrit commented 4 years ago

@kartikdutt18 I do not remember what was in mnist_full.csv. Judging from the name, it merges the train and test files. @sara-hoseininasab You can replace mnist_full.csv with mnist_train.csv, or manually merge both files to create a mnist_full.csv.

kartikdutt18 commented 4 years ago

Right. I am facing an electricity outage here. Most probably it's mnist_train or mnist_all. Once I'm back, I'll try to find out what was in the file.

shrit commented 4 years ago

@kartikdutt18 I will handle it, no worries.

kartikdutt18 commented 4 years ago

@shrit, Awesome! Thanks a lot

shrit commented 4 years ago

@kartikdutt18 I opened a pull request here: https://github.com/mlpack/examples/pull/92 Still, during execution, I hit this issue: https://github.com/mlpack/examples/issues/84

kartikdutt18 commented 4 years ago

Hmm, Let me look into that now.

kartikdutt18 commented 4 years ago

Hey @shrit, I just downloaded all the datasets. I also got mnist_full.csv in my data folder.

shrit commented 4 years ago

Hmm, are you sure this is not an old one?

kartikdutt18 commented 4 years ago

Yeah looks that way. Sorry about that.

kartikdutt18 commented 4 years ago

I am able to fix the issue you hit during execution (mlpack/examples#84). Another thing: mnist_train has one extra labels column, which we also need to drop.

shrit commented 4 years ago

@kartikdutt18 Do you mean the first column, as parsed in this line: https://github.com/mlpack/examples/blob/fbce4998c09abe433c42f2e1519df1b9c94aba5f/mnist_simple/mnist_simple.cpp#L76 Or is there another one?

kartikdutt18 commented 4 years ago

Right, this one. Labels are not needed for the VAE, so maybe mnist_full was the training CSV without labels.

sahooora commented 4 years ago

@shrit Thanks. I used mnist_train.csv in place of mnist_full.csv and got this error: [FATAL] The output width / output height is not possible given the other parameters of the layer.

terminate called after throwing an instance of 'std::runtime_error'

shrit commented 4 years ago

@kartikdutt18 Exactly. In this case, I will remove it using Armadillo from inside mnist_vae. @sara-hoseininasab We are working on this issue. You need to remove the first column of this dataset using Armadillo.

kartikdutt18 commented 4 years ago

@sara-hoseininasab, the padding type needs to be "valid" in the transposed convolution layer. Earlier, I think the boost visitors took care of that, but now we need to specify it manually. I'll open a PR shortly to fix that.

sahooora commented 4 years ago

@shrit I am completely new to this area. Could you please tell me how I can remove the first column of the dataset using Armadillo? Regards

kartikdutt18 commented 4 years ago

I think you can add the following line :

fullData = fullData.rows(1, fullData.n_rows - 1);

at line 65.

sahooora commented 4 years ago

It should be added to the mnist_vae_cnn.cpp file, am I right? I did that, but I still get the same error.

kartikdutt18 commented 4 years ago

Right. To fix that error, you need to make the following change:

    // Add the first transposed convolution (deconvolution) layer.
    decoder->Add<TransposedConvolution<>>(
        24,  // Number of input activation maps.
        16,  // Number of output activation maps.
        5,   // Filter width.
        5,   // Filter height.
        1,   // Stride along width.
        1,   // Stride along height.
        0,   // Padding width.
        0,   // Padding height.
        10,  // Input width.
        10,  // Input height.
        14,  // Output width.
        14,  // Output height.
        "valid"); // Padding type.

    decoder->Add<LeakyReLU<>>();
    decoder->Add<TransposedConvolution<>>(16, 1, 15, 15, 1, 1, 1, 1,
        14, 14, 28, 28, "valid");

Replace the similar layers with this one.
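
One way to see why the padding type matters here is the textbook output-size relation for a transposed convolution. The sketch below is only that relation, not mlpack's internal check, and TransposedConvOutputSize is a hypothetical name. It also assumes that "valid" means the padding is treated as zero, which is the usual convention.

```cpp
// Textbook output size of a transposed convolution along one axis:
//   out = stride * (in - 1) + kernel - 2 * padding
// Assumption: this mirrors the usual definition; mlpack's own validation
// may be implemented differently.
int TransposedConvOutputSize(int in, int kernel, int stride, int padding)
{
  return stride * (in - 1) + kernel - 2 * padding;
}
```

Under this relation the first layer (input 10, filter 5, stride 1, padding 0) gives 14, as requested. The second layer (input 14, filter 15, stride 1) gives 28 only with zero padding; with the padding of 1 that is passed explicitly it would give 26, which is consistent with the "output width / output height is not possible" error appearing when the padding type is not set to "valid".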

sahooora commented 4 years ago

@kartikdutt18 Thanks, the previous error has been solved. But now I get a new one:

error: subtraction: incompatible matrix dimensions: 784x28 and 59999x28
terminate called after throwing an instance of 'std::logic_error'
what(): subtraction: incompatible matrix dimensions: 784x28 and 59999x28
Aborted (core dumped)

shrit commented 4 years ago

784x28 is the size of the older mnist_full matrix. @kartikdutt18 I do not know how VAE works in detail, but the neural models inside seem to be tailored to the old dataset.

kartikdutt18 commented 4 years ago

Right using mnist_full.csv doesn't give this error.

sahooora commented 4 years ago

As you said, I renamed mnist_train.csv to mnist_full.csv and then ran the model.

kartikdutt18 commented 4 years ago

The "valid" padding part of the error is fixed in mlpack/mlpack#2436.

kartikdutt18 commented 4 years ago

Could you download the dataset from here? Thanks.

kartikdutt18 commented 4 years ago

@shrit I'll also try to figure out whether the models are tailored to the old dataset.

shrit commented 4 years ago

@kartikdutt18 Actually, it is the input size: 28 * 28 = 784. It seems that the entire dataset is treated as an image.

kartikdutt18 commented 4 years ago

Right. VAE model takes in an image and outputs an image.

sahooora commented 4 years ago

@kartikdutt18 I downloaded it. With the new mnist_full.csv I got the error below:

error: subtraction: incompatible matrix dimensions: 784x50 and 783x50
terminate called after throwing an instance of 'std::logic_error'
what(): subtraction: incompatible matrix dimensions: 784x50 and 783x50

kartikdutt18 commented 4 years ago

Ahh, you don't have to remove columns with this one. Just make the change in the model only.
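
The 784x50 versus 783x50 mismatch appears consistent with the label-dropping line being applied to a file that has no label to drop: 28 * 28 = 784 pixel values per image, minus one. A small guard along the following lines could make that decision explicit instead of hard-coding it. This is a sketch under assumptions: LabelRowsToDrop is a hypothetical helper, and it assumes the loaded matrix holds one image per column with at most one leading label row.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: given the number of rows of the loaded matrix (one
// column per image), return how many leading label rows must be dropped so
// that each column holds exactly width * height pixel values.
std::size_t LabelRowsToDrop(std::size_t nRows,
                            std::size_t width,
                            std::size_t height)
{
  const std::size_t pixels = width * height;
  if (nRows == pixels)
    return 0; // Already pixel-only, as with the re-downloaded file.
  if (nRows == pixels + 1)
    return 1; // One extra row: the labels.
  throw std::invalid_argument("unexpected number of rows for this image size");
}
```

The result could then replace the literal 1 in the fullData.rows(...) line shown earlier in the thread, so the same code handles both variants of the dataset.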

sahooora commented 4 years ago

@kartikdutt18 Thanks. I fixed it and ran it again. This time it crashed with a core dump:

Training ...
Initial loss -> 726.155
terminate called after throwing an instance of 'std::bad_array_new_length'
what(): std::bad_array_new_length
Aborted (core dumped)

Thanks

kartikdutt18 commented 4 years ago

I'll look into that tomorrow. Sorry about the mess though, we will update the ReadMe and get all examples running. Thanks a lot for being so patient.

sahooora commented 4 years ago

Ok, thank you so much.

kartikdutt18 commented 4 years ago

Hey @sara-hoseininasab, there are a couple more things that need to be fixed. I'll create a PR today with a simple VAE (non-CNN type). You can use that until the issue is resolved.

sahooora commented 4 years ago

Hi, @kartikdutt18 Thank you so much. I'm waiting. Please let me know when it's ready. Regards

kartikdutt18 commented 4 years ago

I created a branch here that has the normal VAE, but it still segfaults. I don't know why. I tried the rest of the examples again and they work fine. The problem lies with the VAE, and I think that requires a bit of debugging.

sahooora commented 4 years ago

@kartikdutt18 Please let me know when the bug is fixed. Regards

sahooora commented 4 years ago

@kartikdutt18 Any news about the vae?

kartikdutt18 commented 4 years ago

In the mlpack source, the parameter error is fixed. The only error left is the segmentation fault. I'll try to find out why that happens in the next couple of days.

mlpack-bot[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

shrit commented 4 years ago

keep open

mlpack-bot[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

rcurtin commented 4 years ago

Just wanted to check in @shrit and @kartikdutt18, do you think that we have resolved this issue? Or is there work we still need to do to be able to run these models?

kartikdutt18 commented 4 years ago

Hey @rcurtin, there is a partial fix in mlpack/mlpack#2436. However, the VAE model still segfaults. Since this has more to do with the examples repo, we also have an issue here.