SynBioDex / pySBOL2

A pure Python implementation of the SBOL standard.
Apache License 2.0

PartShop.pullCollection #372

Closed JoshuaUrrutia closed 4 years ago

JoshuaUrrutia commented 4 years ago

Hello, I'm trying to pull all the sequences and attachments for a particular collection, but I get an AttributeError when I call the pullCollection function:

import sbol2
SBH_USER = 'username'
SBH_PASSWORD = 'password'
sbh = sbol2.PartShop('https://hub.sd2e.org')
sbh.login(SBH_USER, SBH_PASSWORD)
doc = sbol2.Document()
sbh.pullCollection('https://hub.sd2e.org/user/sd2e/UCSB_GBW_LandingSiteDesignv0_1through36/UCSB_GBW_LandingSiteDesignv0_1through36_collection/1', doc)
AttributeError                            Traceback (most recent call last)
<ipython-input-47-044b5ae9fcf4> in <module>
      5 sbh.login(SBH_USER, SBH_PASSWORD)
      6 doc = sbol2.Document()
----> 7 sbh.pullCollection('https://hub.sd2e.org/user/sd2e/UCSB_GBW_LandingSiteDesignv0_1through36/UCSB_GBW_LandingSiteDesignv0_1through36_collection/1', doc)

AttributeError: 'PartShop' object has no attribute 'pullCollection'

Thanks, Joshua

JoshuaUrrutia commented 4 years ago

And here's a link to the docs for that method: https://pysbol2.readthedocs.io/en/latest/API.html#sbol.libsbol.PartShop.pullCollection

tcmitchell commented 4 years ago

Unfortunately, you're looking at the documentation for pySBOL, not pySBOL2, even though the URL suggests otherwise. The documentation for pySBOL2 is at https://pysbol.readthedocs.io. (pySBOL had already taken the pysbol2 location at readthedocs, so for pySBOL2 we used pysbol. Sorry for the confusion, and please update your bookmark!)

You are correct that PartShop.pullCollection is missing from pySBOL2, although it existed in pySBOL. We could add an alias that calls PartShop.pull, but I'm not sure there is much value in having PartShop.pullCollection, PartShop.pullComponentDefinition, and PartShop.pullSequence when they all amount to PartShop.pull. I think the best choice is to use PartShop.pull directly.

But....

The URI in your example fails to pull: SynBioHub returns a validation error. If you navigate to https://hub.sd2e.org/user/sd2e/UCSB_GBW_LandingSiteDesignv0_1through36/UCSB_GBW_LandingSiteDesignv0_1through36_collection/1/sbol and you're patient, you'll see it. I also see a SynBioHub validation error on https://hub.sd2e.org/user/sd2e/foo/foo_collection/1/sbol. Here's a working example though, using pull instead of pullCollection:

import sbol2
SBH_USER = 'username'
SBH_PASSWORD = 'password'

sbh = sbol2.PartShop('https://hub.sd2e.org')
sbh.login(SBH_USER, SBH_PASSWORD)
doc = sbol2.Document()
# These two collections currently fail with a SynBioHub validation error:
# uri = 'https://hub.sd2e.org/user/sd2e/UCSB_GBW_LandingSiteDesignv0_1through36/UCSB_GBW_LandingSiteDesignv0_1through36_collection/1'
# uri = 'https://hub.sd2e.org/user/sd2e/foo/foo_collection/1'
# This collection pulls successfully:
uri = 'https://hub.sd2e.org/user/sd2e/SalisLabCircuitDesigns/SalisLabCircuitDesigns_collection/1'
sbh.pull(uri, doc)
print(doc)

tcmitchell commented 4 years ago

Some debugging suggests that the collection is too big to pull all at once. Instead, here is how to pull the Sequences in the collection a few at a time using advanced search:

import sbol2
SBH_USER = 'username'
SBH_PASSWORD = 'password'

sbh = sbol2.PartShop('https://hub.sd2e.org')
sbh.login(SBH_USER, SBH_PASSWORD)
doc = sbol2.Document()
uri = 'https://hub.sd2e.org/user/sd2e/UCSB_GBW_LandingSiteDesignv0_1through36/UCSB_GBW_LandingSiteDesignv0_1through36_collection/1'

# Fetch this many at a time
step = 5
query = sbol2.SearchQuery(sbol2.SBOL_SEQUENCE, limit=step)
query[sbol2.SBOL_COLLECTION] = uri

# Find out how many there are
count = sbh.search_count_advanced(query)
print(f'There are {count} sequences to pull')

# Now fetch them a chunk at a time
for offset in range(0, count, step):
    print(f'pulling from offset {offset}')
    query.offset = offset
    result = sbh.search_advanced(query)
    identities = [item.identity for item in result]
    sbh.pull(identities, doc, recursive=False)
print(doc)
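
The loop above could be wrapped in a small reusable helper. This is only a sketch: pull_in_chunks is a hypothetical name, not part of pySBOL2, and it assumes that part_shop and query behave like the sbol2.PartShop and sbol2.SearchQuery objects used above (search_count_advanced, search_advanced, pull, and a settable query.offset). The query is assumed to have been created with limit=step.

```python
def pull_in_chunks(part_shop, query, doc, step):
    """Pull every search hit for `query` into `doc`, `step` records at a time.

    Hypothetical helper, not part of pySBOL2. It relies only on the calls
    shown in the example above. Returns the number of identities pulled.
    """
    # Ask the server how many records match before fetching any of them.
    count = part_shop.search_count_advanced(query)
    pulled = 0
    for offset in range(0, count, step):
        # Move the search window forward by one chunk.
        query.offset = offset
        result = part_shop.search_advanced(query)
        identities = [item.identity for item in result]
        if not identities:
            # Stop early if the server returns fewer records than it counted.
            break
        # recursive=False keeps each request small, as in the example above.
        part_shop.pull(identities, doc, recursive=False)
        pulled += len(identities)
    return pulled
```

With sbh, query, doc, and step set up as in the example above, calling pull_in_chunks(sbh, query, doc, step) would stand in for the for loop.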

tcmitchell commented 4 years ago

We have decided not to add the specialized PartShop.pullType() methods, since they would all simply call PartShop.pull(). If there is further demand for these methods for backward compatibility, they can be added in the future.
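
In the meantime, code written against pySBOL could keep its call sites with a small local shim. The function below is a hypothetical sketch, not something pySBOL2 provides: pull_collection is an invented name, and it assumes only that the object passed in has a pull(uris, doc, recursive=...) method, as PartShop.pull is used earlier in this thread.

```python
def pull_collection(part_shop, uri, doc, recursive=True):
    """Hypothetical stand-in for pySBOL's PartShop.pullCollection.

    Not part of pySBOL2. It simply forwards to part_shop.pull, which is
    what the specialized pull methods would amount to.
    """
    return part_shop.pull(uri, doc, recursive=recursive)
```

A call such as pull_collection(sbh, uri, doc) would then behave the same as sbh.pull(uri, doc), including any validation error that SynBioHub returns for the collection.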