microsoft / azure-devops-python-api

Azure DevOps Python API
https://docs.microsoft.com/azure/devops/integrate/index?view=azure-devops
MIT License

[Showstopper] Cannot use continuation token in API 7.0 and above (thus limiting number of results) #461

Open jfthuong opened 1 year ago

jfthuong commented 1 year ago

Hi, in earlier versions of the API (up to 6.0.0b4), when you requested certain items (e.g. WorkItems, Test Suites, ...), you got a response object with a value and a continuation_token that you could use to make a new request and fetch the next batch.

For example, here is the signature of one such function:

    def get_test_suites_for_plan(self, project, plan_id, expand=None, continuation_token=None, as_tree_view=None):
        """GetTestSuitesForPlan.
        [Preview API] Get test suites for plan.
        :param str project: Project ID or project name
        :param int plan_id: ID of the test plan for which suites are requested.
        :param str expand: Include the children suites and testers details.
        :param str continuation_token: If the list of suites returned is not complete, a continuation token to query next batch of suites is included in the response header as "x-ms-continuationtoken". Omit this parameter to get the first batch of test suites.
        :param bool as_tree_view: If the suites returned should be in a tree structure.
        :rtype: :class:`<GetTestSuitesForPlanResponseValue>`
        """

So you could do something like:

resp = client.get_test_suites_for_plan(project, my_plan_id)
suites = resp.value
while resp.continuation_token:
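    # Pass the token back, otherwise the first batch is requested again forever
    resp = client.get_test_suites_for_plan(project, my_plan_id, continuation_token=resp.continuation_token)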
    suites += resp.value

With more recent versions (in particular 7.0), the function returns a plain list instead, capped at the page size imposed by the API, and the response object with its continuation_token is gone.

For example, the equivalent function now looks like this:

    def get_test_suites_for_plan(self, project, plan_id, expand=None, continuation_token=None, as_tree_view=None):
        """GetTestSuitesForPlan.
        [Preview API] Get test suites for plan.
        :param str project: Project ID or project name
        :param int plan_id: ID of the test plan for which suites are requested.
        :param str expand: Include the children suites and testers details.
        :param str continuation_token: If the list of suites returned is not complete, a continuation token to query next batch of suites is included in the response header as "x-ms-continuationtoken". Omit this parameter to get the first batch of test suites.
        :param bool as_tree_view: If the suites returned should be in a tree structure.
        :rtype: :class:`<[TestSuite]> <azure.devops.v6_0.test_plan.models.[TestSuite]>`
        """

How can I retrieve the continuation token to fetch the remaining results?

jfthuong commented 1 year ago

One way to proceed would be to patch the _send method of the base Client (in client.py) so that it stores the continuation token of the last request.

Something like:

class Client(object):
    """Client.
    :param str base_url: Service URL
    :param Authentication creds: Authenticated credentials.
    """

    def __init__(self, base_url=None, creds=None):
        ...
        self.continuation_token_last_request = None

    def _send(self, http_method, location_id, version, route_values=None,
              query_parameters=None, content=None, media_type='application/json', accept_media_type='application/json',
              additional_headers=None):
        ...
        response = self._send_request(request=request, headers=headers, content=content, media_type=media_type)
        ...
        # Patch: Workaround to be able to see the continuation token of the response
        self.continuation_token_last_request = self._get_continuation_token(response)

        return response

And we could use it like this:

>>> suite_plan_id = 68185
>>> suites = test_plan_client.get_test_suites_for_plan(project, suite_plan_id)
>>> len(suites), test_plan_client.continuation_token_last_request
(200, '339901;0')
>>> while test_plan_client.continuation_token_last_request is not None:
...     suites += test_plan_client.get_test_suites_for_plan(project, suite_plan_id, continuation_token=test_plan_client.continuation_token_last_request)
>>> len(suites), test_plan_client.continuation_token_last_request
(214, None)

Any better way or idea?

Note that there is already a method Client._get_continuation_token, but it does not seem to be used in the current version of the API.

jfthuong commented 1 year ago

FYI... I have written a function that temporarily applies the patch I described above:

"""Patching ADO Client to retrieve continuation token

Related to question in following issue:
https://github.com/microsoft/azure-devops-python-api/issues/461
"""
import logging
from typing import Optional, cast

from azure.devops import _models
from azure.devops.client import Client
from azure.devops.client_configuration import ClientConfiguration
from msrest import Deserializer, Serializer
from msrest.service_client import ServiceClient

logger = logging.getLogger("azure.devops.client")

# pylint: disable=super-init-not-called

class ClientPatch(Client):
    """Client.
    :param str base_url: Service URL
    :param Authentication creds: Authenticated credentials.
    """

    def __init__(self, base_url=None, creds=None):
        self.config = ClientConfiguration(base_url)
        self.config.credentials = creds
        self._client = ServiceClient(creds, config=self.config)
        _base_client_models = {
            k: v for k, v in _models.__dict__.items() if isinstance(v, type)
        }
        self._base_deserialize = Deserializer(_base_client_models)
        self._base_serialize = Serializer(_base_client_models)
        self._all_host_types_locations = {}
        self._locations = {}
        self._suppress_fedauth_redirect = True
        self._force_msa_pass_through = True
        self.normalized_url = Client._normalize_url(base_url)
        self.continuation_token_last_request: Optional[str] = None

    def _send(
        self,
        http_method,
        location_id,
        version,
        route_values=None,
        query_parameters=None,
        content=None,
        media_type="application/json",
        accept_media_type="application/json",
        additional_headers=None,
    ):
        request = self._create_request_message(
            http_method=http_method,
            location_id=location_id,
            route_values=route_values,
            query_parameters=query_parameters,
        )
        negotiated_version = self._negotiate_request_version(
            self._get_resource_location(self.normalized_url, location_id), version
        )
        negotiated_version = cast(str, negotiated_version)

        if version != negotiated_version:
            logger.info(
                "Negotiated api version from '%s' down to '%s'."
                " This means the client is newer than the server.",
                version,
                negotiated_version,
            )
        else:
            logger.debug("Api version '%s'", negotiated_version)

        # Construct headers
        headers = {
            "Content-Type": media_type + "; charset=utf-8",
            "Accept": accept_media_type + ";api-version=" + negotiated_version,
        }
        if additional_headers is not None:
            for key in additional_headers:
                headers[key] = str(additional_headers[key])
        if self.config.additional_headers is not None:
            for key in self.config.additional_headers:
                headers[key] = self.config.additional_headers[key]
        if self._suppress_fedauth_redirect:
            headers["X-TFS-FedAuthRedirect"] = "Suppress"
        if self._force_msa_pass_through:
            headers["X-VSS-ForceMsaPassThrough"] = "true"
        if (
            Client._session_header_key in Client._session_data
            and Client._session_header_key not in headers
        ):
            headers[Client._session_header_key] = Client._session_data[
                Client._session_header_key
            ]
        response = self._send_request(
            request=request, headers=headers, content=content, media_type=media_type
        )
        if Client._session_header_key in response.headers:
            Client._session_data[Client._session_header_key] = response.headers[
                Client._session_header_key
            ]

        # Patch: Workaround to be able to see the continuation token of the response
        self.continuation_token_last_request = self._get_continuation_token(response)

        return response

def patch_azure_devops_client():
    """Patch the Azure DevOps client to see the continuation token of the response"""
    # pylint: disable=protected-access
    Client.__init__ = ClientPatch.__init__  # type: ignore
    Client._send = ClientPatch._send  # type: ignore

Importing and calling patch_azure_devops_client before creating a client adds a continuation_token_last_request attribute to the client. Not ideal, but at least it works for me.

I'm really curious to know the real way to find the token though...
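A lighter variant of the same idea, as a minimal sketch: instead of copying the bodies of __init__ and _send, wrap the existing _send and record the header after it returns. The helper name install_token_capture and the FakeClient/FakeResponse classes are hypothetical stand-ins so the sketch runs offline; the header name "x-ms-continuationtoken" comes from the docstring quoted above. This has not been tested against the real library, where the call would presumably be install_token_capture(Client).

```python
import functools


def install_token_capture(client_cls, header="x-ms-continuationtoken"):
    """Wrap client_cls._send so each instance remembers the last continuation token."""
    original_send = client_cls._send

    @functools.wraps(original_send)
    def _send_and_capture(self, *args, **kwargs):
        response = original_send(self, *args, **kwargs)
        # A plain dict (as in the stand-in below) needs the exact key;
        # real HTTP response headers are usually case-insensitive.
        self.continuation_token_last_request = response.headers.get(header)
        return response

    client_cls._send = _send_and_capture


# --- Hypothetical stand-ins, only here to make the sketch runnable offline ---
class FakeResponse:
    def __init__(self, headers):
        self.headers = headers


class FakeClient:
    def __init__(self, tokens):
        self._tokens = iter(tokens)

    def _send(self, http_method, location_id, version, **kwargs):
        token = next(self._tokens)
        headers = {} if token is None else {"x-ms-continuationtoken": token}
        return FakeResponse(headers)


install_token_capture(FakeClient)
client = FakeClient(["339901;0", None])

client._send("GET", "some-location-id", "7.0")
first_token = client.continuation_token_last_request

client._send("GET", "some-location-id", "7.0")
second_token = client.continuation_token_last_request
```

Because the wrapper delegates to the original _send, it should keep working if the library changes the internals of that method, as long as the method still returns the response object.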

jfthuong commented 1 year ago

Maybe it is related to an old and closed issue: https://github.com/microsoft/azure-devops-python-api/issues/152

jeffyoungstrom commented 1 year ago

The fact that the sample code in the project README does not run due to this issue and there has been no response here from the team does not instill a lot of confidence, does it?

jfthuong commented 1 year ago

From one Jeff to another... yes, you are right.

And not much more luck on Stack Overflow, even though I started a bounty.

treyBohon commented 1 year ago

I just tried updating my team's stack and ran into the same issue. I'm confused about how it's been this way for so long. Either we're both missing something, or this has in fact been a problem since v6.0.0, and you can bypass it by using 6.0.0b4 and specifying the v5.1 API. But that gets back to sanity checking: has it really been broken for about 4 years since 6.0.0 came out, with everyone collectively just working around it by using the v5.1 API?

I'm guessing v6.0.0b4 will work for at least a couple of more years, so we're just planning on checking on this later.

natescherer commented 6 months ago

Worth mentioning that you now have to choose between using Python 3.12 and having functional pagination, as only 7.1.0b4 works on Python 3.12.

Would love to see some TLC from Microsoft on this. I'm contemplating just using the Azure CLI's devops extension via subprocess.run, since this project appears to be more or less broken and unmaintained.

elfgirl commented 2 months ago

The continuation 'token' is just the pagination offset: "200;0", "400;0", "600;0". So you can simply loop until a call no longer returns exactly 200 results to gather all the items. No need to mess about with the internals.

Super unintuitive, I know, and I only discovered it by going in to patch things.
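A minimal sketch of that loop, assuming the token really is "<items fetched so far>;0" and the page size is 200. Both are assumptions taken from this comment, and the token shown earlier in this thread ('339901;0' after 200 results) does not follow that pattern, so check what your endpoint returns before relying on it. fetch_all_by_offset and fake_fetch_page are hypothetical names; with the real client, fetch_page would be something like lambda token: client.get_test_suites_for_plan(project, plan_id, continuation_token=token).

```python
def fetch_all_by_offset(fetch_page, page_size=200):
    """Collect all items, assuming tokens have the form '<offset>;0'."""
    items = []
    token = None  # no token requests the first page
    while True:
        page = fetch_page(token)
        items.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch
            return items
        token = f"{len(items)};0"


# --- Hypothetical stand-in for the API call, to make the sketch runnable ---
DATA = list(range(450))


def fake_fetch_page(token, page_size=200):
    offset = 0 if token is None else int(token.split(";")[0])
    return DATA[offset:offset + page_size]


all_items = fetch_all_by_offset(fake_fetch_page)
```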

DoozerRHS commented 4 weeks ago

Thanks for this thread. For anybody else that lands here... I tried the @elfgirl call sequence above, but it did not work: the continuation token returned for me was an ID, not an item count, and hence not predictable. However, the @jfthuong solution did work for me.

Although the instructions were clear, as a Python noob it took me a while to get there, so I'm commenting to provide example usage of the proposed patch solution. I agree this needs fixing. I would prefer a returned class containing both the continuation token and the list, or an additional alternate method that hides the pagination complexity from the caller.

I added a file called PatchedAzureClient.py to my project, with the code provided by @jfthuong above, and changed my Azure interaction class to use it, e.g. import PatchedAzureClient as patchingClient ...


from types import SimpleNamespace
from typing import List

# Assumption: these import paths match a 7.0 install; adjust the version to your setup
from azure.devops.v7_0.test_plan.models import TestSuite
from azure.devops.v7_0.test_plan.test_plan_client import TestPlanClient

import PatchedAzureClient as patchingClient


def getChildTestSuites(a_context: SimpleNamespace, a_testPlanId: str) -> List[TestSuite]:
    # Set up override behaviour to add the continuation token feature
    patchingClient.patch_azure_devops_client()

    # Create patched version of client
    testPlanClient: TestPlanClient = a_context.connection.clients.get_test_plan_client()

    # Pagination means max 200 records a go - keep calling until we have all results
    retVal: List[TestSuite] = testPlanClient.get_test_suites_for_plan(project="Foo", plan_id=a_testPlanId)

    while testPlanClient.continuation_token_last_request:
        retVal += testPlanClient.get_test_suites_for_plan(project="Foo", plan_id=a_testPlanId, continuation_token=testPlanClient.continuation_token_last_request)

    return retVal
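The wish above for a method that hides pagination from the caller could be sketched as a small generator. This is only an illustration under assumptions: iter_paged and FakePagedClient are hypothetical names, and the stand-in client merely mimics a patched client that exposes continuation_token_last_request after each call.

```python
from typing import Callable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def iter_paged(
    fetch_page: Callable[[Optional[str]], List[T]],
    last_token: Callable[[], Optional[str]],
) -> Iterator[T]:
    """Yield items from every page, following continuation tokens."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        # Read the token right away, before another request can overwrite it
        token = last_token()
        yield from page
        if not token:
            return


class FakePagedClient:
    """Hypothetical stand-in that mimics the patched client's behaviour."""

    def __init__(self, pages):
        self._pages = pages  # maps token -> (items, next_token)
        self.continuation_token_last_request = None

    def get_items(self, continuation_token=None):
        items, next_token = self._pages[continuation_token]
        self.continuation_token_last_request = next_token
        return items


fake = FakePagedClient({None: ([1, 2], "a"), "a": ([3, 4], "b"), "b": ([5], None)})
collected = list(
    iter_paged(
        lambda token: fake.get_items(continuation_token=token),
        lambda: fake.continuation_token_last_request,
    )
)
```

With the patched client from this thread, fetch_page would wrap get_test_suites_for_plan and last_token would read testPlanClient.continuation_token_last_request, so the calling code only sees a single iterable.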