kbaseattic / assembly

An extensible framework for genome assembly.
MIT License

Ensure index on user and jobid #300

Closed cbun closed 9 years ago

cbun commented 9 years ago

This should do the trick. You only need to ensure index once, but it is idempotent so there's no harm in calling it on every init.

My dev mongo shows that it worked:

> db.jobs.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "ns" : "arast.jobs",
                "name" : "_id_"
        },
        {
                "v" : 1,
                "key" : {
                        "ARASTUSER" : 1,
                        "job_id" : 1
                },
                "ns" : "arast.jobs",
                "name" : "ARASTUSER_1_job_id_1"
        }
]
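
For readers who want to reproduce the listing above, here is a minimal sketch of the "ensure on every init" idea. It is not the code from this PR: `ensure_job_index`, `JOB_INDEX`, and `FakeJobs` are hypothetical names, and the sketch calls `create_index` because newer pymongo releases dropped `ensure_index`. The in-memory stand-in only imitates the index naming seen in the output, so the example runs without a MongoDB server.

```python
# Sketch only: `jobs` is any object exposing pymongo's Collection.create_index().
ASCENDING = 1  # same value as pymongo.ASCENDING

# Compound index on user first, then job id (hypothetical constant name).
JOB_INDEX = [("ARASTUSER", ASCENDING), ("job_id", ASCENDING)]


def ensure_job_index(jobs):
    """Request the compound (ARASTUSER, job_id) index and return its name.

    Intended to be safe on every start-up: asking for an index that
    already exists with the same definition should not rebuild it.
    """
    return jobs.create_index(JOB_INDEX)


class FakeJobs:
    """In-memory stand-in (assumption) so the sketch runs without a server."""

    def __init__(self):
        self.indexes = {"_id_": [("_id", ASCENDING)]}

    def create_index(self, keys):
        # Imitate the default naming: "<field>_<direction>" joined by "_".
        name = "_".join("%s_%s" % (field, direction) for field, direction in keys)
        self.indexes.setdefault(name, list(keys))  # no-op if already present
        return name


jobs = FakeJobs()
first = ensure_job_index(jobs)
second = ensure_job_index(jobs)  # repeated call leaves the index set unchanged
print(first, sorted(jobs.indexes))
```

With a real deployment, `jobs` would be the `arast.jobs` collection object instead of the stand-in.
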
cbun commented 9 years ago

#299

cbun commented 9 years ago

@sebhtml I see. To be clear, you said:

In the following statement, the find on the "ARASTUSER" key will work because it is a prefix of the compound index. And the sort() on the 'job_id' will also work because the preceding field "ARASTUSER" is subject to an equality.

db.jobs.find({'ARASTUSER':user}).sort('job_id', 1)

Will this only work properly with two separate indexes, or will it also work with the compound index?

sebhtml commented 9 years ago

It will work.
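
The prefix rule quoted above can be written down as a small check. This is a simplified sketch, not MongoDB's query planner: `index_supports` is a hypothetical helper, it ignores sort direction and range predicates, and it only models the case discussed here (equality on leading fields, then a sort on the fields that follow).

```python
def index_supports(index_keys, equality_fields, sort_fields):
    """Return True if one compound index can serve both the filter and the sort.

    Simplified rule: the equality fields must cover a leading prefix of the
    index (in any order), and the sort fields must be the index keys that
    come immediately after that prefix, in index order.
    """
    n = len(equality_fields)
    # Reject duplicate equality fields and anything that is not a prefix.
    if len(set(equality_fields)) != n:
        return False
    if set(index_keys[:n]) != set(equality_fields):
        return False
    # The sort must continue the index right where the equality prefix ends.
    following = index_keys[n:n + len(sort_fields)]
    return list(sort_fields) == list(following)


compound = ["ARASTUSER", "job_id"]

# find({'ARASTUSER': user}).sort('job_id', 1): equality on the prefix, sort on the next key.
print(index_supports(compound, ["ARASTUSER"], ["job_id"]))

# Sorting on job_id without fixing ARASTUSER does not line up with the index order.
print(index_supports(compound, [], ["job_id"]))
```

Under this rule the single compound index is enough for the query in question, and a second index on `job_id` alone would only be needed for queries that do not fix `ARASTUSER`.
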

cbun commented 9 years ago

Okay great. The fix is addressing this line:

for j in jobs.find({'ARASTUSER':user}).sort('job_id', 1):

https://github.com/kbase/assembly/blob/39caa691c0d9a470abc3339c53b0bd9ddc65459d/lib/assembly/metadata.py#L111

levinas commented 9 years ago

What’s the cost of ensureIndex()? I suppose it’s only called once whenever the server launches?

On Feb 24, 2015, at 6:03 PM, Christopher Bun notifications@github.com wrote:

Okay great. The fix is addressing this line:

for j in jobs.find({'ARASTUSER':user}).sort('job_id', 1):

https://github.com/kbase/assembly/blob/39caa691c0d9a470abc3339c53b0bd9ddc65459d/lib/assembly/metadata.py#L111

sebhtml commented 9 years ago

As Chris said, it is idempotent. Its cost is quite small because it uses a cache to check if the index already exists.

On the other hand, create_index (in pymongo) does not check if the index already exists.

Chris can tell you whether the constructor MetadataConnection() is called only on startup.
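
To illustrate why the cached call is cheap, here is a toy model of the difference described above. It is not pymongo's implementation: `CountingServer` and `CachedIndexer` are hypothetical classes, and the 300-second lifetime is an assumed value chosen for the example.

```python
import time


class CountingServer:
    """Hypothetical stand-in for the database; counts index requests."""

    def __init__(self):
        self.requests = 0
        self.indexes = set()

    def create_index(self, name):
        self.requests += 1
        self.indexes.add(name)  # adding an existing name changes nothing


class CachedIndexer:
    """Remember recently ensured indexes so repeat calls skip the server."""

    def __init__(self, server, ttl_seconds=300):  # assumed cache lifetime
        self.server = server
        self.ttl_seconds = ttl_seconds
        self._seen = {}  # index name -> time it was last sent to the server

    def create_index(self, name):
        # Always talks to the server, like an uncached call.
        self.server.create_index(name)
        self._seen[name] = time.monotonic()

    def ensure_index(self, name):
        # Only talks to the server when the cache entry is missing or stale.
        last = self._seen.get(name)
        if last is None or time.monotonic() - last > self.ttl_seconds:
            self.create_index(name)


server = CountingServer()
indexer = CachedIndexer(server)

for _ in range(3):
    indexer.ensure_index("ARASTUSER_1_job_id_1")
ensure_requests = server.requests  # requests caused by the three ensure calls

for _ in range(3):
    indexer.create_index("ARASTUSER_1_job_id_1")
print(ensure_requests, server.requests)
```

In this model the index ends up identical either way; the only difference is how many round trips reach the server, which is the cost being asked about.
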