jonluca / har-to-openapi

HAR to OpenAPI spec generator

Trying to run har-to-openapi on large .har file returns nothing #11

Closed Bluscream closed 2 months ago

Bluscream commented 2 months ago

Log

PS C:\Users\blusc\AppData\Local\Temp\psdetest> ls

    Directory: C:\Users\blusc\AppData\Local\Temp\psdetest

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          03/09/2024    04:23                har2openapi
-a---          03/09/2024    04:31           2525 index.mjs
-a---          03/09/2024    04:06      706007128 input.har
-a---          03/09/2024    04:32            214 openapi.json
-a---          03/09/2024    04:32            113 openapi.yaml
PS C:\Users\blusc\AppData\Local\Temp\psdetest> node index.mjs
(node:25684) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
{
  spec: {
    openapi: '3.0.0',
    info: { title: 'HarToOpenApi - no valid specs found', version: '1.0.0' },
    paths: {},
    servers: [ [Object] ]
  },
  yamlSpec: 'openapi: 3.0.0\n' +
    'info:\n' +
    '  title: HarToOpenApi - no valid specs found\n' +
    '  version: 1.0.0\n' +
    'paths: {}\n' +
    'servers:\n' +
    '  - url: /\n',
  domain: undefined
}

Source

import { generateSpec } from "har-to-openapi";
import * as fs from 'fs/promises';

// read a har file from wherever you want - in this example its just a root json object
const har = await fs.readFile("input.har");

const openapi = await generateSpec(har, {
    // if true, we'll treat every url as having the same domain, regardless of what its actual domain is
    // the first domain we see is the domain we'll use
    // forceAllRequestsInSameSpec: false,
    // // if true, every path object will have its own servers entry, defining its base path. This is useful when
    // // forceAllRequestsInSameSpec is set
    // addServersToPaths: false,
    // // try and guess common auth headers
    // guessAuthenticationHeaders: true,
    // // if the response has this status code, ignore the body
    // // ignoreBodiesForStatusCodes: [],
    // // whether non standard methods should be allowed (like HTTP MY_CUSTOM_METHOD)
    // relaxedMethods: true,
    // // whether we should try and parse non application/json responses as json - defaults to true
    // relaxedContentTypeJsonParse: true,
    // // a list of tags that match passed on the path, either [match_and_tag] or [match, tag]
    // // tags?: ([string] | [string, string] | string)[] | ((url: string) => string | string[] | void);
    // // response mime types to filter for
    // mimeTypes: ['text/json','application/json'],
    // // known security headers for this har, to add to security field in openapi (e.g. "X-Auth-Token")
    securityHeaders: ['X-Origin-Integrity'],
    // // Whether to filter out all standard headers from the parameter list in openapi
    // filterStandardHeaders: true,
    // // Whether to log errors to console
    // logErrors: true,
    // // a string, regex, or callback to filter urls for inclusion
    // // urlFilter: /https:\/\/www\.pietsmiet\.de\/api\/.*/,
    // // when we encounter a URL, try and parameterize it, such that something like
    // // GET /uuids/123e4567-e89b-12d3-a456-426655440000 becomes GET /uuids/{uuid}
    // attemptToParameterizeUrl: true,
    // // when we encounter a path without a response or with a response that does not have 2xx, dont include it
    // dropPathsWithoutSuccessfulResponse: false,
  });
const { spec, yamlSpec } = openapi;
console.log(openapi);
fs.writeFile("openapi.json", JSON.stringify(spec, null, 4));
fs.writeFile("openapi.yaml", yamlSpec)
// spec = { ... } openapi spec schema document
// yamlSpec = string, "info: ..."

Bluscream commented 2 months ago

I ran a Python script to count the entries with print("entries:", len(content["log"]["entries"])):

PS C:\Users\blusc\AppData\Local\Temp\psdetest> python .\test.py
entries: 3417

The file is about 689 MB.

jonluca commented 2 months ago

Can you share the har? Are you passing in the right path to the har? No valid specs found means you probably aren't passing in the right path to the har file.

Bluscream commented 2 months ago

Are you passing in the right path to the har? No valid specs found means you probably aren't passing in the right path to the har file.

If the HAR isn't found, it gives an actual error:

PS C:\Users\blusc\AppData\Local\Temp\psdetest> node index.mjs
(node:33076) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
node:internal/fs/promises:638
  return new FileHandle(await PromisePrototypeThen(
                        ^

Error: ENOENT: no such file or directory, open 'C:\Users\blusc\AppData\Local\Temp\psdetest\UserluscAppDataLocalTemppsdetest
iltered.har'
    at async open (node:internal/fs/promises:638:25)
    at async Module.readFile (node:internal/fs/promises:1238:14)
    at async file:///C:/Users/blusc/AppData/Local/Temp/psdetest/index.mjs:5:13 {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'C:\\Users\\blusc\\AppData\\Local\\Temp\\psdetest\\Users\bluscAppDataLocalTemppsdetest\filtered.har'
}

Node.js v22.6.0

Also, yeah, the path is correct, as seen in the logs and the code. It's in the same folder as index.mjs and the terminal's working directory.

Can you share the har?

It's very big, so I have no idea where I should share it.

Also, it does work when I run it through https://github.com/dcarr178/har2openapi but that requires a lot of manual fixing of the schema.

Bluscream commented 2 months ago

Here's a version pre-filtered to the base API URL with a Python script, which is just 8 MB: https://workupload.com/file/THAyctYneks (pw: g). This still results in no valid spec.

import json
import sys

API_PREFIX = "https://www.pietsmiet.de/api/"

def process_file(file_path, out_path="filtered.har"):
    with open(file_path, 'r', encoding='utf-8') as f:
        content = json.load(f)
    # Keep only the entries whose request URL starts with the API base URL.
    kept = []
    for entry in content["log"]["entries"]:
        url = entry["request"]["url"]
        if url.startswith(API_PREFIX):
            print(url)
            kept.append(entry)
    # Swap the entry list in place instead of deep-copying the whole document.
    content["log"]["entries"] = kept
    with open(out_path, 'w', encoding='utf-8') as f:  # Write the filtered content to a new file
        json.dump(content, f, indent=4)
    print(f"\nWrote filtered file: {out_path}")

if __name__ == "__main__":
    process_file("input.har")
    sys.exit(0)

Bluscream commented 2 months ago

Got it working! I was forgetting to JSON.parse in between reading the file and trying to generate the spec. I just yoinked that from your tests and now it works:

import { generateSpec } from "har-to-openapi";
import { fileURLToPath } from "url";
import path, { dirname } from "path";
import * as fs from 'fs/promises';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

const dataDir = path.join(__dirname, "data");

// the encoding option belongs to readFile, not to JSON.parse
const readHar = async (name) =>
  JSON.parse(await fs.readFile(name, { encoding: "utf8" }));

// read a har file from wherever you want - in this example its just a root json object
const har = await readHar("filtered.har");

const openapi = await generateSpec(har, {
    // if true, we'll treat every url as having the same domain, regardless of what its actual domain is
    // the first domain we see is the domain we'll use
    forceAllRequestsInSameSpec: true,
    // // if true, every path object will have its own servers entry, defining its base path. This is useful when
    // // forceAllRequestsInSameSpec is set
    addServersToPaths: true,
    // // try and guess common auth headers
    guessAuthenticationHeaders: true,
    // // if the response has this status code, ignore the body
    // // ignoreBodiesForStatusCodes: [],
    // // whether non standard methods should be allowed (like HTTP MY_CUSTOM_METHOD)
    relaxedMethods: true,
    // // whether we should try and parse non application/json responses as json - defaults to true
    relaxedContentTypeJsonParse: true,
    // // a list of tags that match passed on the path, either [match_and_tag] or [match, tag]
    // // tags?: ([string] | [string, string] | string)[] | ((url: string) => string | string[] | void);
    // // response mime types to filter for
    // mimeTypes: ['text/json','application/json'],
    // // known security headers for this har, to add to security field in openapi (e.g. "X-Auth-Token")
    securityHeaders: ['X-Origin-Integrity','psde_auth3'],
    // // Whether to filter out all standard headers from the parameter list in openapi
    filterStandardHeaders: false,
    // // Whether to log errors to console
    logErrors: true,
    // // a string, regex, or callback to filter urls for inclusion
    urlFilter: /https:\/\/www\.pietsmiet\.de\/api\/.*/,
    // // when we encounter a URL, try and parameterize it, such that something like
    // // GET /uuids/123e4567-e89b-12d3-a456-426655440000 becomes GET /uuids/{uuid}
    attemptToParameterizeUrl: true,
    // // when we encounter a path without a response or with a response that does not have 2xx, dont include it
    dropPathsWithoutSuccessfulResponse: false,
  });
const { spec, yamlSpec } = openapi;
console.log(openapi);
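// await the writes so a failure surfaces as an error instead of an unhandled rejection
await fs.writeFile("openapi.json", JSON.stringify(spec, null, 4));
await fs.writeFile("openapi.yaml", yamlSpec);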
// spec = { ... } openapi spec schema document
// yamlSpec = string, "info: ..."
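The root cause in this thread was that generateSpec received the raw file contents rather than a parsed object, and it returned an empty spec instead of raising an error. A minimal sketch of a guard that would surface that mistake early is shown below. The helper name parseHar and the sample data are hypothetical and not part of har-to-openapi; the only assumption about the HAR format is that a valid document has a log.entries array.

```javascript
// Hypothetical helper (not part of har-to-openapi): turn raw file contents
// into a parsed HAR object and fail loudly if it does not look like one.
function parseHar(raw) {
  // fs.readFile without an encoding yields a Buffer; decode it as UTF-8 first.
  const text = typeof raw === "string" ? raw : Buffer.from(raw).toString("utf8");
  const har = JSON.parse(text); // throws SyntaxError on invalid JSON
  if (!har || typeof har !== "object" || !Array.isArray(har.log?.entries)) {
    throw new TypeError("Not a HAR document: expected an object with log.entries[]");
  }
  return har;
}

// Tiny in-memory sample standing in for the contents of a .har file.
const sample = Buffer.from(
  JSON.stringify({ log: { entries: [{ request: { url: "https://example.com/api/" } }] } })
);
const har = parseHar(sample);
console.log("entries:", har.log.entries.length); // entries: 1
```

In the script above, this could wrap the read, for example parseHar(await fs.readFile("filtered.har")), so that a Buffer or a non-HAR JSON file is rejected before generateSpec is called.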

Bluscream commented 2 months ago

Kinda sad that it was able to "parameterize" /articles/{id} but not /categories/{id}.

jonluca commented 2 months ago

Great! I'll add a test but yes I just got it working as well.

I just added a new option, minLengthForNumericPath, that should let you parameterize that path too: https://github.com/jonluca/har-to-openapi/commit/73b8f6ff460a2bac09eac7850ee863c1ee93c821
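
The option name suggests a digit-count threshold for numeric path segments. The sketch below illustrates that idea only, as an assumption; it is not har-to-openapi's implementation, and the function name parameterizeNumericSegments and the sample IDs are invented for the example. Check the linked commit for the option's actual semantics and default.

```javascript
// Hypothetical illustration only; this is NOT har-to-openapi's implementation.
// Replace purely numeric path segments with {id} when they have at least
// `minLength` digits.
function parameterizeNumericSegments(path, minLength) {
  return path
    .split("/")
    .map((segment) =>
      /^\d+$/.test(segment) && segment.length >= minLength ? "{id}" : segment
    )
    .join("/");
}

console.log(parameterizeNumericSegments("/articles/123456", 3)); // /articles/{id}
console.log(parameterizeNumericSegments("/categories/7", 3));    // /categories/7
console.log(parameterizeNumericSegments("/categories/7", 1));    // /categories/{id}
```

Under this reading, a short numeric ID stays literal until the threshold is lowered, which would match a long article ID being parameterized while a short category ID is not.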