ewels / Labrador

A web based tool to manage and automate the processing of publicly available datasets.
https://www.bioinformatics.babraham.ac.uk/projects/labrador/
GNU General Public License v3.0

Uploading many datasets #2

Open AuroraSavino opened 9 years ago

AuroraSavino commented 9 years ago

When adding datasets to the Usoskin_2015 project, nothing gets uploaded. What I do is open the project, go to "Dataset" and click "Add datasets". I look up accession GSE59739 and many datasets appear. I give them all a cell type and then click on Save all datasets. Since there are so many, a window pops up asking if I am sure and I click OK. Afterwards the screen shows the "Dataset" page without any change, so none of the datasets has been imported. The same happens in other cases, but the datasets are imported if only a subset of them (a few tens at most) is kept.

ewels commented 9 years ago

Thanks @AuroraSavino! If you're using Google Chrome you should see an option in the right-click menu saying "Inspect Element". If you select this you'll get a web developer panel with tabs, one of which is called Console.

Does anything come up in this Console when you try to import the datasets?

AuroraSavino commented 9 years ago

The only line that appears is:

Navigated to http://bilin1/labrador/datasets.php?id=254
ewels commented 9 years ago

Hi @AuroraSavino - I've managed to get Labrador running on my machine and was able to replicate the same problem you've been having.

I'm not sure exactly what is causing the problem, but the sheer number of datasets seems to be crashing the browser. There's no simple fix other than adding the datasets in batches.

In the short term, you should be able to get things to work by manually deleting a load of the datasets before clicking Save. Then repeat, but delete a different bunch to import the next set. This will be really boring and involve lots of clicking, but it should at least work so that @FelixKrueger can start processing your data.

In the long term, I need to either add some kind of pagination for adding datasets, or add a special button to automatically add all datasets found on the SRA. Or both.
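
For the pagination route, I'm imagining something along these lines on the server side (just a sketch; the connection details, table and column names are placeholders, not anything that exists in Labrador yet):

<?php
// Sketch only: paginate the SRA search results so that each "Save all datasets"
// form stays small. Connection details, table and column names are placeholders.
$db = new mysqli('localhost', 'labrador_user', 'password', 'labrador'); // placeholder credentials

$per_page = 50;
$page     = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset   = ($page - 1) * $per_page;

$stmt = $db->prepare('SELECT accession, sample_name FROM sra_search_results LIMIT ? OFFSET ?');
$stmt->bind_param('ii', $per_page, $offset);
$stmt->execute();
$results = $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
// ...render $results as one page of the "Add datasets" form...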

FelixKrueger commented 9 years ago

Hi Phil,

This sometimes works and sometimes it doesn't. We now have quite a few projects that fail in one way or another, and there doesn't seem to be a particular number of datasets above which things break, so the rather tedious clicking process can take a whole afternoon of trial and error to find what works and what doesn't. There are also some single datasets that seem to be troublemakers: as soon as they are included in the download list they cause the entire process to fail. I have tracked down one sample from the latest Farlik_2015 data that won't import under any condition. As you can imagine, it gets very fiddly if you've got, say, 5 culprits in a list of 1500 files that will cause everything to crash :)

On another note, Aurora has also managed to add a load of datasets to some projects (e.g. Tang_2014) that added OK, but if you select the Cluster Flow download nothing happens...

Anyway, much appreciated that you have started working full time on this again! :)

ewels commented 9 years ago

Nice, well you know the process - keep reporting them and we can see if there are any patterns. Add the other Cluster Flow download thing as a separate issue if that's OK.

On a general point - looking at the code is making me quite sad. I just went through doing a load of basic updates just so that I could run it on my machine. I'm inclined to remove the processing section of the site entirely, as it's kind of redundant now and adds quite a lot of complexity. If this was replaced with a download button that gave you the Cluster Flow input file, would you be happy?
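
To be concrete, the download button wouldn't need to be much more than this (a rough sketch; get_project_dataset_urls() is a hypothetical helper, and the exact line format would be whatever Cluster Flow expects):

<?php
// Sketch of the proposed "download Cluster Flow input file" button:
// write one entry per line and send it as a plain-text attachment.
$project_id = (int) $_GET['id'];
$urls = get_project_dataset_urls($project_id); // hypothetical helper returning an array of download URLs

header('Content-Type: text/plain');
header('Content-Disposition: attachment; filename="clusterflow_input_' . $project_id . '.txt"');
foreach ($urls as $url) {
    echo $url . "\n";
}
exit;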

FelixKrueger commented 9 years ago

I would be happy with that, yes. Haven’t used any of the processing steps other than the CF download file ever since you left… Not sure if Simon has another opinion on that.

ewels commented 9 years ago

I'll move this onto a new issue thread (#4) to try to keep things organised.

jcgrenier commented 7 years ago

Hi @ewels, were you able to find a solution to that importing issue? I was trying to import a dataset when I remembered seeing this thread a while back.

Thanks!

JC

jcgrenier commented 7 years ago

Hi @ewels and @FelixKrueger,

Just a quick update here. I was able to add a lot of data in my installation of Labrador.

http://stackoverflow.com/questions/10303714/php-max-input-vars

You just need to add a few lines to your php.ini file, located with the other PHP configuration files (for me the path was /etc/php/7.0/apache2/php.ini):

php_value max_input_vars 10000
php_value suhosin.get.max_vars 10000
php_value suhosin.post.max_vars 10000
php_value suhosin.request.max_vars 10000

In my case, I had about 1000 samples to add, so I just used higher values than what the thread suggests.
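
If you want to double-check that PHP actually picked up the change (and that Apache is reading the php.ini you edited), a quick generic check like this works - nothing Labrador-specific:

<?php
// Sanity check after restarting Apache: which php.ini is loaded,
// and what is the effective max_input_vars value?
echo 'Loaded php.ini: ' . php_ini_loaded_file() . "\n";
echo 'max_input_vars: ' . ini_get('max_input_vars') . "\n";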

Hope it helps others using the tool! Best, JC

FelixKrueger commented 7 years ago

This sounds like a neat little fix, thanks for the update!

ewels commented 7 years ago

Aha, brilliant stuff - thanks @jcgrenier!

s-andrews commented 5 years ago

Just for the record - in our php.ini file the correct syntax for this was just:

max_input_vars = 100000

None of the other variables needed to be changed.