MobileNativeFoundation / XCLogParser

Tool to parse Xcode and xcodebuild logs stored in the xcactivitylog format
Apache License 2.0

The file size of parsed json has become huge since Xcode 12. #117

Closed sgr-ksmt closed 3 years ago

sgr-ksmt commented 3 years ago

Hi XCLogParser developers. Thank you for providing an awesome parser library. 🙇 Let me report a new issue, as described in the title.

I'm currently using xclogparser to parse the xcactivitylog file to collect the build time for each module, which I then report to my team members.

With Xcode 11, this worked fine and the parsed JSON was not very large. With Xcode 12, however, running xclogparser parse --file <file_name> --reporter flatJson produces a huge JSON file (approximately 7.6GB), and that is from an xcactivitylog that is only 3.7MB. It is so huge that I can't parse it on CI (with a custom fastlane lane).

Here's example code for the parsing (some file paths are replaced with placeholder paths):

lane :parse_xactivitylog do
  activity_file = "path/to/log.xcactivitylog"
  output_path = "path/to/output/activity.json"
  sh("#{xclogparser} parse --file #{activity_file} --reporter flatJson --output #{output_path}")
  json_string = File.read(output_path) # <==== Fails to read the JSON file because it is too large
  activity_json = JSON.parse(json_string)
  # ...
end
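One way to make this failure easier to diagnose on CI is to check the report size before reading it. The following is only a sketch: the helper name load_report and the 200 MB limit are hypothetical and not part of XCLogParser or fastlane.

```ruby
require 'json'
require 'tempfile'

# Hypothetical limit; pick a value that fits the memory available on CI.
MAX_REPORT_BYTES = 200 * 1024 * 1024

# Reads and parses a JSON report, but refuses files above max_bytes so the
# lane fails fast with a clear message instead of exhausting memory.
def load_report(path, max_bytes: MAX_REPORT_BYTES)
  size = File.size(path)
  if size > max_bytes
    raise ArgumentError, "#{path} is #{size} bytes, which exceeds the #{max_bytes}-byte limit"
  end
  JSON.parse(File.read(path))
end

# Demo with a tiny stand-in report written to a temporary file.
Tempfile.create(['activity', '.json']) do |file|
  file.write('[{"type": "target", "title": "Build target Core", "duration": 4.25}]')
  file.flush

  puts load_report(file.path).first['title']

  begin
    load_report(file.path, max_bytes: 10)
  rescue ArgumentError => e
    puts "Rejected: #{e.message}"
  end
end
```

The guard does not shrink the report; it only turns an out-of-memory failure into an explicit error.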

Since running into this problem, I tried parsing with other reporter styles and got some information that helped me identify the cause.

Here's the result of running the parse command with --reporter summaryJson (some information is masked):

{
  "parentIdentifier" : "",
  "fetchedFromCache" : false,
  "title" : "Building workspace XXXX with scheme YYYY and configuration Debug",
  "warningCount" : 2526,
  "duration" : xxxxxxx.xxxxx,
  "startTimestamp" : xxxxxxx.xxxxx,
  "signature" : "Building workspace XXXX with scheme YYYY and configuration Debug",
  "errors" : [
    ...
  ],
  "compilationEndTimestamp" : xxxxxxx.xxxxx,
  "compilationDuration" : xxxxxxx.xxxxx,
  "endDate" : "xxxxxxxxxxxxxxxxxx",
  "errorCount" : 81,
  "domain" : "Xcode.IDEActivityLogDomainType.BuildLog",
  "type" : "main",
  "identifier" : "xxxxxxxxxxxxxxxxxx",
  "buildStatus" : "succeeded",
  "schema" : "YYYY and configuration Debug",
  "subSteps" : [

  ],
  "endTimestamp" : xxxxxxx.xxxxx,
  "architecture" : "",
  "machineName" : "ABC",
  "buildIdentifier" : "xxxx-xxxx-xxxx-xxxx",
  "startDate" : "xxxxxxxxxxxxxxxxxx",
  "warnings" : [
    {
      ... there are so many warning details
    }
  ],
  "documentURL" : "",
  "detailStepType" : "none"
}

According to this result, I seem to have over 2500 warnings... 😨. I think that since Xcode 12 the number of warnings has increased, and the parsed JSON now contains a large amount of detail text for those warnings, so the file size has become bloated. However, I don't know why I get so many warnings and such bloated files with Xcode 12 and later.

What I actually want to do

Actually, I just want to get each module's name and build time by looking only at the steps with "type": "target". So in my case, I don't need any error details. BTW, with --reporter summaryJson I can get the total build time, but not the time for each module (that is, --reporter summaryJson can't do what I want).
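The filtering described here could look roughly like the sketch below. It assumes the flatJson output is a flat array of step objects that carry the same "type", "title" and "duration" fields seen in the summaryJson output above. The sample data and the helper name target_durations are hypothetical.

```ruby
require 'json'

# Hypothetical sample standing in for a (small) flatJson report.
SAMPLE_FLAT_JSON = <<~JSON
  [
    {"type": "main",   "title": "Build App",         "duration": 12.5},
    {"type": "target", "title": "Build target Core", "duration": 4.25},
    {"type": "detail", "title": "Compile Foo.swift", "duration": 0.5},
    {"type": "target", "title": "Build target UI",   "duration": 7.0}
  ]
JSON

# Keeps only the steps whose type is "target" and maps title => duration.
def target_durations(steps)
  steps
    .select { |step| step['type'] == 'target' }
    .map { |step| [step['title'], step['duration']] }
    .to_h
end

durations = target_durations(JSON.parse(SAMPLE_FLAT_JSON))
durations.each { |title, seconds| puts format('%-20s %.2fs', title, seconds) }
```

This still requires the whole report to fit in memory, so it only helps once the report itself is a manageable size.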

Based on the above, I have some questions for you.

I hope you can see this issue. Thanks 🙇

ecamacho commented 3 years ago

Hi! Thank you for reporting this. Do you know if it's true that you have 1200 warnings showing in Xcode? I just want to rule out that the tool is duplicating them. We recently fixed an issue that generated duplicated errors in the JSON; the fix is part of release 0.2.21.

I'm leaning towards adding an option to not output the error and warning details. I will run some tests on it.

At Spotify we use XCLogParser as a dependency of another Swift command-line application, so we call it as a library and get the result as a Swift Array that we manipulate before storing its content in a database.

sgr-ksmt commented 3 years ago

@ecamacho Thank you for replying.

Do you know if it's true that you have 1200 warnings showing in Xcode?

In my project, there are over 1500 warnings when I build in Xcode. (Xcode always displays 999+ warnings at the top of the window.)

I'm leaning towards adding an option to not output the error and warning details. I will run some tests on it.

Oh, really? I just want that option! I'm waiting for it. And thank you for sharing a use case at Spotify 🙇

jiaxw32 commented 3 years ago

Same problem here with Xcode 12: the xclogparser tool uses too much memory when running the parse command.

The dump command works fine, but the output of the parse command is huge:

-rw-r--r--  1 username  staff     2016413 12 24 10:13 6022E237-8749-4E8A-AC76-DC3A0C047D2F.xcactivitylog
-rw-r--r--  1 username  staff   101193415 12 24 10:27 dump.json
-rw-r--r--  1 username  staff  2897283500 12 24 10:17 parse_flatJson.json
-rw-r--r--  1 username  staff  2898886818 12 24 10:36 parse_json.json
-rw-r--r--@ 1 username  staff         929 12 24 11:19 parse_summaryJson.json

After I inhibited all warnings in Xcode, the summaryJson looks like this:

{
    "parentIdentifier" : "",
    "fetchedFromCache" : false,
    "title" : "Build xxxxxx",
    "warningCount" : 541,
    "duration" : 660.80073797702789,
    "startTimestamp" : 1608775336.55636,
    "signature" : "Build xxxxxx",
    "errors" : [

    ],
    "compilationEndTimestamp" : 1608775997.3474889,
    "compilationDuration" : 660.79112887382507,
    "endDate" : "2020-12-24T02:13:17.357000Z",
    "errorCount" : 0,
    "domain" : "Xcode.IDEActivityLogDomainType.BuildLog",
    "type" : "main",
    "identifier" : "admin_6022E237-8749-4E8A-AC76-DC3A0C047D2F_1",
    "buildStatus" : "succeeded",
    "schema" : "xxxxxx",
    "subSteps" : [

    ],
    "endTimestamp" : 1608775997.3570981,
    "architecture" : "",
    "machineName" : "admin",
    "buildIdentifier" : "admin_6022E237-8749-4E8A-AC76-DC3A0C047D2F",
    "startDate" : "2020-12-24T02:02:16.556000Z",
    "warnings" : [

    ],
    "notes" : [

    ],
    "documentURL" : "",
    "detailStepType" : "none"
  }

ecamacho commented 3 years ago

@sgr-ksmt @jiaxw32 Can you try the latest release? You can pass the --omit_warnings flag to the parse command: xclogparser parse --reporter json --output report.json --omit_warnings

jiaxw32 commented 3 years ago

@sgr-ksmt @jiaxw32 Can you try the latest release? You can pass the --omit_warnings flag to the parse command: xclogparser parse --reporter json --output report.json --omit_warnings

Hi @ecamacho, thanks for your work. I tried the latest release, but it doesn't work for me!

$ ./xclogparser version
XCLogParser 0.2.23
$ ./xclogparser parse --file 11A25E79-04FB-47D6-B79E-AB39AA95CAD1.xcactivitylog --reporter json --output report.json --omit_warnings

report.json is the parse result; its file size is 2.48G:

-rw-r--r--    1 username  staff     2007818 12 31 09:48 11A25E79-04FB-47D6-B79E-AB39AA95CAD1.xcactivitylog
-rw-r--r--    1 username  staff  2663501617 12 31 09:52 report.json
-rwxr-xr-x@   1 username  staff     3976928 12 30 18:29 xclogparser

When I run the xclogparser tool, it still uses too much memory.

I opened the report.json file with vim and found that the notes contain too much data.

"notes" : [
{
    "characterRangeStart" : 0,
    "startingColumnNumber" : 9,
    "endingColumnNumber" : 9,
    "characterRangeEnd" : 8,
    "detail" : "<module-includes>:1:9: note: in file included from <module-includes>:1:\r#import \"xxx.h\"\r        ^\rxxx.h:17:9: note: in file included from xxx:17:\r#import \"***.h\"\r        ^\r***.h:113:17: warning: pointer is missing a nullability type specifier (_Nonnull, _Nullable, or _Null_unspecified)\r+ (NSDictionary *)functioname;\r                ^\r***.h:113:17: note: insert '_Nullable' if the pointer may be null\r+ (NSDictionary *)functionname;\r                ^\r***.h:113:17: note: insert '_Nonnull' if the pointer should never be null\r+ (NSDictionary *)funcationname;\r   

//     the rest of the data is omitted here
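For a report small enough to load (for example, one from a smaller build), a rough way to see which field dominates the file is to total the serialized size per key. This is a sketch on hypothetical data: the steps array and the helper name bytes_by_key are illustrative, and the key names "notes" and "warnings" are taken from the output shown in this thread.

```ruby
require 'json'

# Hypothetical miniature of a parsed report: two steps, one with a bulky note.
steps = [
  { 'title' => 'Build target Core', 'duration' => 4.25,
    'notes' => [{ 'detail' => 'note: in file included from <module-includes>:1:' * 50 }],
    'warnings' => [{ 'detail' => 'warning: pointer is missing a nullability type specifier' }] },
  { 'title' => 'Build target UI', 'duration' => 7.0, 'notes' => [], 'warnings' => [] }
]

# Sums the serialized byte size of every top-level key across all steps,
# sorted from largest to smallest.
def bytes_by_key(steps)
  totals = Hash.new(0)
  steps.each do |step|
    step.each { |key, value| totals[key] += value.to_json.bytesize }
  end
  totals.sort_by { |_key, bytes| -bytes }.to_h
end

sizes = bytes_by_key(steps)
sizes.each { |key, bytes| puts "#{key}: #{bytes} bytes" }
```

The totals ignore structural overhead such as key names and separators, so they indicate proportions rather than exact file sizes.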

I tried editing the getNotes function in the ArrayExtension.swift file so that it ignores the notes data, and then it works correctly.

The original code:

func getNotes() -> [Notice] {
    return filter {
        $0.type == .note
    }
}

The code after my edit:

func getNotes() -> [Notice] {
    return []
//    return filter {
//        $0.type == .note
//    }
}

Is the notes data important? Can you add another parameter to ignore it? Thanks again!

sgr-ksmt commented 3 years ago

@ecamacho Thanks for your work. I tried parsing with the latest version, but I got the same issue as @jiaxw32.

Here's my result:

...
    "notes" : [
      {
        "characterRangeStart" : 0,
        "startingColumnNumber" : 0,
        "endingColumnNumber" : 0,
        "characterRangeEnd" : 0,
        "detail" : "note: Using new build system\rnote: Building targets in parallel\rnote: Planning build\rnote: Constructing build description\rwarning: The iOS Simulator deployment target ...(too long text)
      },
...

I hope I can ignore the notes.detail field by setting another flag like --omit_notes.

sgr-ksmt commented 3 years ago

@ecamacho How is it going?

ecamacho commented 3 years ago

I will take a look this weekend

emish commented 3 years ago

@sgr-ksmt @jiaxw32 Can you try the latest release? You can pass the --omit_warnings flag to the parse command: xclogparser parse --reporter json --output report.json --omit_warnings

Hello. I'm having similar problems as others here (too much memory usage, too big of file outputs). Trying the --omit_warnings flag did not work for me, it doesn't seem to be packaged with v0.2.22:

~/Documents/XCLogParserTest xclogparser version
XCLogParser 0.2.22
~/Documents/XCLogParserTest xclogparser parse --file DCF3A17D-1D94-4D61-A065-E8F0A0BE1EFF.xcactivitylog --reporter html --rootOutput HTML_1 --without_build_specific_information --omit_warnings
Error: Unknown option '--omit_warnings'
Usage: xclogparser parse [--file <file>] [--derived_data <derived_data>] [--project <project>] [--workspace <workspace>] [--xcodeproj <xcodeproj>] [--reporter <reporter>] [--machine_name <machine_name>] [--redacted] [--without_build_specific_information] [--strictProjectName] [--output <output>] [--rootOutput <rootOutput>]
  See 'xclogparser parse --help' for more information.

ecamacho commented 3 years ago

@emish I just released v0.2.25. Can you try that one? It should have the --omit_warnings option and it also adds an --omit_notes option.

@sgr-ksmt can you try v0.2.25? It adds an --omit_notes option
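In a fastlane lane like the one at the top of this issue, the two flags could be appended when building the command string. The sketch below is an assumption about how one might wire it up: the helper name xclogparser_parse_command and its keyword arguments are hypothetical, and only the --omit_warnings and --omit_notes flags and the parse options come from this thread.

```ruby
require 'shellwords'

# Builds the xclogparser parse command, optionally adding the flags that
# drop warning and note details from the report.
def xclogparser_parse_command(binary:, log:, output:, omit_warnings: true, omit_notes: true)
  args = [binary, 'parse', '--file', log, '--reporter', 'flatJson', '--output', output]
  args << '--omit_warnings' if omit_warnings
  args << '--omit_notes' if omit_notes
  Shellwords.join(args) # escapes each argument for safe use in a shell
end

command = xclogparser_parse_command(
  binary: 'xclogparser',
  log: 'path/to/log.xcactivitylog',
  output: 'path/to/output/activity.json'
)
puts command
```

The resulting string can be passed to sh(...) in the lane. Shellwords.join keeps paths with spaces from splitting into separate arguments.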

sgr-ksmt commented 3 years ago

@ecamacho Thank you for your effort. I tried v0.2.25, but it still didn't work...

I checked your PR and left a comment about fixing the --omit_notes option. Could you please check https://github.com/spotify/XCLogParser/pull/126/files#r585419594? 🙇

sgr-ksmt commented 3 years ago

@ecamacho Thank you for the quick fix. I confirmed that --omit_notes works fine. If @jiaxw32 gets the same good result, I think we can close this issue. 🙏 How about you?

jiaxw32 commented 3 years ago

@ecamacho Thank you for the quick fix. I confirmed that --omit_notes works fine. If @jiaxw32 gets the same good result, I think we can close this issue. 🙏 How about you?

Works fine for me now, nice work! @sgr-ksmt @ecamacho

Test log:

# xclogparser version
% ./xclogparser version
XCLogParser 0.2.26

# parse, including the notes
% ./xclogparser parse --reporter json --output report.json --file  C5A4CC00-2064-4733-BC2C-AF6A599917CF.xcactivitylog

# parse, omitting the notes
% ./xclogparser parse --reporter json --output report_without_notes.json --file  C5A4CC00-2064-4733-BC2C-AF6A599917CF.xcactivitylog --omit_notes

# list the report file size
% ls -lh *.json
-rw-r--r--  1 user  staff   2.1G  3  3 16:22 report.json
-rw-r--r--  1 user  staff   236M  3  3 16:24 report_omit_notes.json