ModelSEED / ProbModelSEED


list_models is quite slow #26

Open nconrad opened 9 years ago

nconrad commented 9 years ago

Putting a ticket in for this so it's not forgotten.

mmundy42 commented 9 years ago

Here's what list_models() is doing under the covers. First, it asks the workspace service for all of the objects in the user's root folder. Second, for each object in the root folder, it asks the workspace to recursively search for model and fba objects. The workspace service returns the metadata for all of the model and fba objects. Third, it processes the metadata for the returned objects to build the output data structure.
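The three steps above can be sketched roughly as follows. This is only an illustration, not the actual ProbModelSEED code: the in-memory `OBJECTS` list, the helper names `ls_root` and `search`, and the rule that an fba object belongs to the model whose path is its parent are all assumptions made for the example.

```python
# Hypothetical stand-in for the workspace service: a flat list of object metadata.
OBJECTS = [
    {"path": "/alice/home/notes", "type": "string", "meta": {}},
    {"path": "/alice/home/ecoli", "type": "model", "meta": {"reactions": 1200}},
    {"path": "/alice/home/ecoli/fba.0", "type": "fba", "meta": {"objective": 0.8}},
    {"path": "/alice/plants/maize", "type": "model", "meta": {"reactions": 900}},
]


def ls_root(user):
    """Step 1: list the entries directly under the user's root folder."""
    prefix = f"/{user}/"
    names = {o["path"][len(prefix):].split("/")[0]
             for o in OBJECTS if o["path"].startswith(prefix)}
    return sorted(prefix + name for name in names)


def search(folder, types):
    """Step 2: recursive search under one folder, filtered by object type."""
    return [o for o in OBJECTS
            if o["path"].startswith(folder + "/") and o["type"] in types]


def list_models_sketch(user):
    """Step 3: fold the returned metadata into one record per model."""
    models, fbas = {}, []
    for folder in ls_root(user):  # one recursive search per root entry
        for obj in search(folder, {"model", "fba"}):
            if obj["type"] == "model":
                models[obj["path"]] = {"ref": obj["path"],
                                       "meta": obj["meta"],
                                       "fba_count": 0}
            else:
                fbas.append(obj)
    for fba in fbas:
        # Assumed convention: an fba object lives directly under its model.
        parent = fba["path"].rsplit("/", 1)[0]
        if parent in models:
            models[parent]["fba_count"] += 1
    return list(models.values())
```

Note that the number of search requests grows with the number of entries in the root folder, and every request returns full metadata for each match, so either factor could contribute to the latency.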

I'm not sure whether the performance problem is the recursive search by the workspace service, the large amount of metadata returned for model and fba objects, or a combination of the two.

I have 11 models in my workspace and it takes about 4.5 seconds for the server to process the request.

mmundy42 commented 9 years ago

I wrote a simple test (in Python) to simulate the requests to the workspace server done by list_models().

  1. A query for 'string' objects found 13 objects and took 1 second
  2. A query for 'model' objects found 11 objects and took 1 second
  3. A query for 'fba' objects found 32 objects and took 6 seconds
  4. A query for 'fba' and 'model' objects found 43 objects and took 6 seconds

There isn't much metadata for fba objects, so the performance issue seems to be the number of objects returned by the recursive search.
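A timing test of this kind might look like the sketch below. It is not the script used for the numbers above: `fake_query` is a hypothetical stand-in for the real workspace call that just returns the object counts reported in this thread, so the elapsed times it prints are meaningless until a real client is swapped in.

```python
import time


def timed_query(query, types):
    """Run query(types) and return (object_count, elapsed_seconds)."""
    start = time.perf_counter()
    objects = query(types)
    return len(objects), time.perf_counter() - start


def fake_query(types):
    # Hypothetical stand-in for the workspace search; replace with a real client call.
    # The counts mirror the ones reported in this thread.
    catalog = {"string": 13, "model": 11, "fba": 32}
    return [t for t in types for _ in range(catalog.get(t, 0))]


for types in (["string"], ["model"], ["fba"], ["fba", "model"]):
    count, elapsed = timed_query(fake_query, types)
    print(f"{types}: {count} objects in {elapsed:.3f} s")
```

Timing each object type separately, and then the combined query, is what makes it possible to tell whether the cost tracks the number of objects or the type of object.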

nconrad commented 9 years ago

When I checked the autometa in the model object metadata, it was ~100 KB for one of my models with over 1000 reactions. So might this also affect performance?

For my 32 models, that would be a transfer of some 3 MB of uncompressed data on the backend, I would think.
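As a back-of-the-envelope check of that estimate, here is a small sketch. The function name is made up for illustration, and it assumes every model carries about as much autometa as the one measured, which may overstate the total for smaller models.

```python
def estimated_payload_mb(n_models, autometa_kb_per_model=100):
    """Rough uncompressed payload, assuming each model has similar autometa size."""
    return n_models * autometa_kb_per_model / 1000  # KB -> MB (decimal)


print(estimated_payload_mb(32))  # 32 models at ~100 KB each
```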

I'm seeing 4.7 to 7.8 seconds to list my 32 models.

[Screenshot, 2015-06-23 2:43:31 PM]

mmundy42 commented 9 years ago

Do you need all of that metadata for the UX? That seems like too much metadata.

nconrad commented 9 years ago

I don't need that data there, and it doesn't make sense to me to have it there, especially if there is no way to avoid fetching it.

I guess this all depends on where model data is going to be stored. But I would need data similar to that autometa for dropdowns (say, reaction knockouts) if we want to support those dropdowns without fetching all of the model data, which I think would be nice to have. Another potential problem is that if a custom reaction is not in solr or somewhere else, I can't display its name in the UI without fetching the whole model. Maybe we should discuss all this on the next call.