Closed domwhewell-sage closed 2 months ago
These were the workspaces that were deemed to be "in-scope" from my search
[SCAN] acoustic_nathan (SCAN:53703fa1eedb9bfe473c86c237874a195f6911ab) TARGET (in-scope, target)
[ORG_STUB] Sage TARGET (in-scope, target)
[CODE_REPOSITORY] {"url": "https://www.postman.com/aerospace-operator-71300412/sage-intacct-api"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/infoconn-netsuite/intact-sage-ws"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/sagenda/sagenda-s-public-workspace"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/hmn1996/sage50cloudintegration"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/bmscat/sage-intacct"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/hygge-sage-food/sagefood"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/nicobulzansage/my-qa-sage-workspace"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/kaydd/dd-sage"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/payload-cosmologist-32319016/sage-intacct-dev-team"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/orange-astronaut-147938/sage-200-api-requests"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/flight-explorer-20084619/silicone-sages"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/kcharles1902/sage-300-people"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/winter-astronaut-940577/sagex-api"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/lunar-moon-898681/sageworkspace"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/jjamesecg/sage-workspace"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/gold-crater-125745/sage-apis"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/sage-network-enablement/sage-network-public"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/docking-module-cosmologist-91951398/sage"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/orange-rocket-719414/sage"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/lockstepapi/banking-service-sage"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/orange-desert-64565/sage"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/mission-observer-77863399/sage-intacct"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/de-sage/de-sage-s-public-workspace"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/ScottySurrao/sage-portal"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/research-geoscientist-84013159/sage-web-services"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/dark-meteor-860165/sage-api"} postman (cdn-cloudflare, distance-1, postman)
[CODE_REPOSITORY] {"url": "https://www.postman.com/cloudy-firefly-825356/sageaccountingapi"} postman (cdn-cloudflare, distance-1, postman)
This works well for some domains, since a small amount of noise is tolerable. But for others, especially ones with common stubs like hotels.com, buy.com, etc., it can be very problematic.
Is it possible to filter down the search to only ones that contain the full target domain?
EDIT: relevant issue:
My initial thought was to change the if statement to a regex matcher with word boundaries, to ensure we are matching the org_name and only the org_name, but there is another option...
Maybe we could search for the org_name, request all of the workspaces, collections and environments, and then, before saving them to disk / raising a new FILESYSTEM event, do something like
for k, v in json.items():
    if (
        isinstance(v, str)
        and (
            self.helpers.is_dns_name(v, include_local=False)
            or self.helpers.is_url(v)
            or self.helpers.is_email(v)
        )
        and self.scan.in_scope(v)
    ):
        self.verbose(f'Found in-scope key "{k}": "{v}" for {org}, it appears to be in-scope')
        in_scope = True
        break
and discard the workspaces that do not contain a dns_name that is in scope
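As a rough, standalone illustration of that idea (not the module's actual code), the sketch below walks a nested workspace JSON and keeps the workspace only if some string value resolves to a host under a target domain. Everything here is an assumption for illustration: `in_scope` is a simplified stand-in for `self.scan.in_scope`, `host_of` is a hypothetical helper, and the sample workspace data is made up.

```python
from urllib.parse import urlparse


def iter_strings(obj, key=None):
    """Yield (key, value) for every string found anywhere in nested JSON data."""
    if isinstance(obj, str):
        yield key, obj
    elif isinstance(obj, dict):
        for k, v in obj.items():
            yield from iter_strings(v, k)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_strings(item, key)


def host_of(value):
    """Best-effort hostname from a URL, an email address, or a bare DNS name."""
    value = value.strip().lower()
    if "://" in value:
        try:
            return urlparse(value).hostname or ""
        except ValueError:  # malformed URL, treat as no host
            return ""
    if "@" in value:
        return value.rsplit("@", 1)[1]
    return value


def in_scope(value, targets):
    """Simplified stand-in for the scan's scope check: exact or subdomain match."""
    host = host_of(value)
    return any(host == t or host.endswith("." + t) for t in targets)


def workspace_in_scope(workspace, targets):
    """True if any string value in the workspace points at a target domain."""
    return any(in_scope(v, targets) for _, v in iter_strings(workspace))
```

With targets of `{"sage.com"}`, a workspace containing a request URL such as `https://api.sage.com/v1/items` would be kept, while one that only references `https://sagenda.net/api` would be discarded. Unlike the flat loop above, this version also descends into nested collections and lists.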
Yeah I think that would be a good solution
Describe the bug
I'm not sure the in-scope check for the postman module is robust enough.
This if statement matches if the org name is contained in the name of the workspace. For example, I have the org name "Sage". This check matches the workspace name "Sagenda's Public Workspace", which probably shouldn't be in my scope. But workspaces like "Sage Intacct Workspace" and "Sage 200 API" should be (I'm probably using a little human intuition here, as I know these are Sage products).
I'm not sure if a regex matcher would be better, to check that the org name is not in the middle of a word or conjoined with another. It's more difficult than GitHub, as orgs don't necessarily have their own Postman profile (or at least they don't regularly, in my experience).
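The regex idea mentioned above could look roughly like the following. This is only a sketch of word-boundary matching, not the module's current check; `org_name_matches` is a hypothetical helper name, and it assumes the org name starts and ends with an alphanumeric character (otherwise `\b` does not behave as intended).

```python
import re


def org_name_matches(org_name, workspace_name):
    """True if org_name appears in workspace_name as a whole word, case-insensitively."""
    # re.escape guards against regex metacharacters in the org name;
    # \b requires a word/non-word transition on both sides of the match.
    pattern = re.compile(rf"\b{re.escape(org_name)}\b", re.IGNORECASE)
    return pattern.search(workspace_name) is not None
```

With the org name "Sage", this would accept "Sage Intacct Workspace", "Sage 200 API" and slugs such as "sage-intacct", and reject "Sagenda's Public Workspace" and "sageworkspace". It still cannot tell an unrelated "Sage" (for example "De Sage's Public Workspace") from the target org, so it reduces noise without removing it. Note also that `_` counts as a word character, so "sage_intacct" would not match.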