microsoft / restler-fuzzer

RESTler is the first stateful REST API fuzzing tool for automatically testing cloud services through their REST APIs and finding security and reliability bugs in these services.
MIT License

Wrong dynamic name/variable dependency in compiling output #910

Open HangyiWang opened 3 months ago

HangyiWang commented 3 months ago

Description

When I run restler compile to generate the grammar files, it creates an incorrect dynamic dependency for one of the API requests, so RESTler has to create another, unrelated resource before sending that request.

Steps to reproduce

These are the partial resource definitions for my PUT /api/assets/{assetName}.

AssetProperties:
    properties:
      assetEndpointProfileRef:
        type: string
      attributes:
        additionalProperties: {}
        type: object
      datasets:
        items:
          $ref: '#/definitions/AssetDataset'
        type: array
      defaultDatasetsConfiguration:
        type: string
      defaultEventsConfiguration:
        type: string
      defaultTopic:
        $ref: '#/definitions/Topic'
...
AssetDataset:
    properties:
      dataPoints:
        items:
          $ref: '#/definitions/AssetDataPoint'
        type: array
      datasetConfiguration:
        type: string
      name:
        type: string
      topic:
        $ref: '#/definitions/Topic'
    type: object
AssetDataPoint:
    properties:
      capabilityId:
        type: string
      dataPointConfiguration:
        type: string
      dataSource:
        type: string
      name:
        type: string
      observabilityMode:
        type: string
    type: object

I also have another path, PUT /api/datasets/{datasetName}, which has nothing to do with the assets API.

But in the assets request in grammar.py, I see _api_datasets__datasetName__put_name.reader(), which comes from the dataset creation API. Look at the name property that follows datasetConfiguration in the output below. It should be a simple string.

primitives.restler_static_string(""",
            "datasets":
            [
                {
                    "dataPoints":
                    [
                        {
                            "capabilityId":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
                            "dataPointConfiguration":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
                            "dataSource":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
                            "name":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
                            "observabilityMode":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string("""
                        }
                    ],
                    "datasetConfiguration":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
                    "name":"""),
    primitives.restler_static_string(_api_datasets__datasetName__put_name.reader(), quoted=True),
    primitives.restler_static_string(""",
                    "topic":
                        {
                            "path":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
                            "retain":"""),
    primitives.restler_fuzzable_bool("true"),
    primitives.restler_static_string("""
                        }
                }
            ],
            "defaultDatasetsConfiguration":"""),
    primitives.restler_fuzzable_string("fuzzstring", quoted=True),
    primitives.restler_static_string(""",
            "defaultEventsConfiguration":"""),

If I remove /api/datasets/{datasetName} from my swagger input, RESTler generates the correct file.

And I didn't set any producer/consumer relationships in my annotation file.
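To check whether other requests picked up similar dependencies, the generated grammar can be scanned for dynamic variable reads. This is a minimal sketch, assuming dependencies show up in grammar.py as `<variable>.reader()` calls like the one above. The function name `find_reader_references` and the sample text are illustrative, not part of RESTler.

```python
import re

# Matches dynamic-variable reads such as
# `_api_datasets__datasetName__put_name.reader()`.
READER_PATTERN = re.compile(r"\b(_\w+)\.reader\(\)")


def find_reader_references(grammar_source):
    """Return (line_number, variable_name) pairs for every `.reader()` call."""
    hits = []
    for line_number, line in enumerate(grammar_source.splitlines(), start=1):
        for match in READER_PATTERN.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# Small excerpt standing in for the contents of grammar.py.
sample = '''primitives.restler_static_string(""",
                    "name":"""),
    primitives.restler_static_string(_api_datasets__datasetName__put_name.reader(), quoted=True),
'''

for line_number, name in find_reader_references(sample):
    print(f"line {line_number}: {name}")
```

To scan the real file, read its text (for example with `open(...).read()`) and pass it in place of `sample`.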

Expected results

The name property should be a simple fuzzable string instead of _api_datasets__datasetName__put_name.reader().

Actual results

The name property is populated by _api_datasets__datasetName__put_name.reader(). Is this caused by incorrect RESTler logic?

Environment details

marina-p commented 2 months ago

Hello @HangyiWang,

This issue is due to imprecision in one of the heuristics in RESTler for identifying dependencies. If these names should simply be arbitrary strings, adding the following to the dictionary should resolve the incorrect dependencies in the grammar (or adding a custom payload generator for the strings if the names must be unique):

  "restler_custom_payload": {

    "/properties/datasets/[0]/name": ["any", "string"],
    "/properties/status/datasets/[0]/name": ["any", "string"]

  },

Unfortunately, this same imprecision could impact new APIs with a similar naming pattern in the body and those new properties would need to be added similarly to the above. (Please keep this issue open to track improving the dependency analysis to fix the root cause).
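Since new properties may need the same treatment over time, a small helper can merge such entries into an existing dictionary. This is only a sketch: the helper name `add_custom_payloads` and the `existing` dictionary contents are hypothetical stand-ins, and the override paths are the ones from the snippet above.

```python
import json


def add_custom_payloads(dictionary, overrides):
    """Return a copy of a RESTler dictionary with extra
    `restler_custom_payload` entries merged in."""
    merged = dict(dictionary)
    payloads = dict(merged.get("restler_custom_payload") or {})
    payloads.update(overrides)
    merged["restler_custom_payload"] = payloads
    return merged


# Body paths that should be plain strings rather than dependencies.
overrides = {
    "/properties/datasets/[0]/name": ["any", "string"],
    "/properties/status/datasets/[0]/name": ["any", "string"],
}

# Hypothetical stand-in for a dictionary loaded from disk with json.load.
existing = {"restler_fuzzable_string": ["fuzzstring"], "restler_custom_payload": {}}

updated = add_custom_payloads(existing, overrides)
print(json.dumps(updated, indent=2))
```

The helper returns a new dictionary and leaves the input untouched, so the result can be reviewed before it is written back to the dictionary file and the grammar is recompiled.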

Thanks,

Marina

HangyiWang commented 2 months ago

Thanks!