benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
http://www.videlibri.de/xidel.html
GNU General Public License v3.0

[DOC] How to change output order? #107

Closed: Kochise closed this issue 6 months ago

Kochise commented 1 year ago

Hi, I use Vera++ to extract information and generate an XML file:

vera++ -c "out.xml" -r "%VERA_BIN%\..\lib\vera++" --profile default --show-rule -i "files_to_analyze.txt">"out.txt"

I get an out.xml that looks something like:

<?xml version="1.0" encoding="UTF-8"?>
<checkstyle version="5.0">
    <file name="C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c">
        <error source="F002" severity="info" line="1" message="directory structure too deep" />
        <error source="T011" severity="info" line="43" message="closing curly bracket not in the same line or column" />
        <error source="T009" severity="info" line="52" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="52" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="52" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="52" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="53" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="53" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="53" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="53" message="comma should be followed by whitespace" />
        <error source="T011" severity="info" line="68" message="closing curly bracket not in the same line or column" />
        <error source="L001" severity="info" line="116" message="trailing whitespace" />
        <error source="T009" severity="info" line="158" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="165" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="172" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="179" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="186" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="193" message="comma should be followed by whitespace" />
        <error source="T012" severity="info" line="198" message="negation operator used in its short form" />
        <error source="T012" severity="info" line="203" message="negation operator used in its short form" />
        <error source="T009" severity="info" line="210" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="217" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="224" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="231" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="238" message="comma should be followed by whitespace" />
        <error source="T009" severity="info" line="245" message="comma should be followed by whitespace" />
        <error source="T011" severity="info" line="255" message="closing curly bracket not in the same line or column" />
        <error source="T019" severity="info" line="274" message="full block {} expected in the control structure" />
        <error source="T019" severity="info" line="275" message="full block {} expected in the control structure" />
        <error source="T011" severity="info" line="276" message="closing curly bracket not in the same line or column" />
        <error source="T011" severity="info" line="284" message="closing curly bracket not in the same line or column" />
        <error source="T011" severity="info" line="287" message="closing curly bracket not in the same line or column" />
        <error source="T019" severity="info" line="288" message="full block {} expected in the control structure" />
        <error source="T011" severity="info" line="292" message="closing curly bracket not in the same line or column" />
        <error source="T011" severity="info" line="304" message="closing curly bracket not in the same line or column" />
        <error source="T011" severity="info" line="327" message="closing curly bracket not in the same line or column" />
        <error source="T011" severity="info" line="330" message="closing curly bracket not in the same line or column" />
        <error source="L004_120" severity="info" line="331" message="line is longer than 120 characters" />
    </file>
    <file name="C:\Users\toto\Downloads\GLFrontier-win32\src\as68k\as68k.c">
        <error source="F002" severity="info" line="1" message="directory structure too deep" />
        <error source="L003" severity="info" line="1" message="leading empty line(s)" />
        <error source="T009" severity="info" line="29" message="comma should be followed by whitespace" />
...

Then I try to extract the relevant information for SonarQube:

xidel "out.xml" --input-format=xml --extract="<checkstyle><file name={$file:=.}><error message={$msg:=.} line={$line:=.} source={$id:=.}/>*</file>*</checkstyle>" 1>"out.tmp" 2>nul

I get an out.tmp that looks something like:

id := F002
line := 1
msg := directory structure too deep
id := T011
line := 43
msg := closing curly bracket not in the same line or column
id := T009
line := 52
msg := comma should be followed by whitespace
id := T009
line := 52
msg := comma should be followed by whitespace
id := T009
line := 52
msg := comma should be followed by whitespace
id := T009
line := 52
msg := comma should be followed by whitespace
id := T009
line := 53
msg := comma should be followed by whitespace
id := T009
line := 53
msg := comma should be followed by whitespace
id := T009
line := 53
msg := comma should be followed by whitespace
id := T009
line := 53
msg := comma should be followed by whitespace
id := T011
line := 68
msg := closing curly bracket not in the same line or column
id := L001
line := 116
msg := trailing whitespace
id := T009
line := 158
msg := comma should be followed by whitespace
id := T009
line := 165
msg := comma should be followed by whitespace
id := T009
line := 172
msg := comma should be followed by whitespace
id := T009
line := 179
msg := comma should be followed by whitespace
id := T009
line := 186
msg := comma should be followed by whitespace
id := T009
line := 193
msg := comma should be followed by whitespace
id := T012
line := 198
msg := negation operator used in its short form
id := T012
line := 203
msg := negation operator used in its short form
id := T009
line := 210
msg := comma should be followed by whitespace
id := T009
line := 217
msg := comma should be followed by whitespace
id := T009
line := 224
msg := comma should be followed by whitespace
id := T009
line := 231
msg := comma should be followed by whitespace
id := T009
line := 238
msg := comma should be followed by whitespace
id := T009
line := 245
msg := comma should be followed by whitespace
id := T011
line := 255
msg := closing curly bracket not in the same line or column
id := T019
line := 274
msg := full block {} expected in the control structure
id := T019
line := 275
msg := full block {} expected in the control structure
id := T011
line := 276
msg := closing curly bracket not in the same line or column
id := T011
line := 284
msg := closing curly bracket not in the same line or column
id := T011
line := 287
msg := closing curly bracket not in the same line or column
id := T019
line := 288
msg := full block {} expected in the control structure
id := T011
line := 292
msg := closing curly bracket not in the same line or column
id := T011
line := 304
msg := closing curly bracket not in the same line or column
id := T011
line := 327
msg := closing curly bracket not in the same line or column
id := T011
line := 330
msg := closing curly bracket not in the same line or column
id := L004_120
line := 331
msg := line is longer than 120 characters
file := C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c
id := F002
line := 1
msg := directory structure too deep
id := L003
line := 1
msg := leading empty line(s)
id := T009
line := 29
msg := comma should be followed by whitespace
...

The problem is that the file := C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c entry comes very late: it only appears after all of that file's id :=, line := and msg := entries.

Currently I store the id :=, line := and msg := entries in a temporary file with a placeholder header, which I replace as soon as I reach the file := entry, but that doubles the work.
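For illustration, the buffering workaround described above could be sketched in Python as follows. This is only a rough sketch, not the script actually used here: the function name regroup is hypothetical, and it assumes every record arrives in the order id, line, msg (as in the sample output) and that each file := entry follows the records that belong to it.

```python
from typing import Dict, Iterable, Iterator, List


def regroup(lines: Iterable[str]) -> Iterator[str]:
    """Hold back id/line/msg records until the owning `file :=` entry arrives."""
    pending: List[Dict[str, str]] = []  # finished records still waiting for a file name
    current: Dict[str, str] = {}        # fields of the record being assembled
    for raw in lines:
        name, sep, value = raw.rstrip("\r\n").partition(" := ")
        if not sep:
            continue  # not a `name := value` line, e.g. a blank line
        if name == "file":
            # The file name closes the group: emit everything buffered so far.
            for rec in pending:
                yield f"{value}:{rec['line']}: {rec['id']}: {rec['msg']}"
            pending = []
        else:
            current[name] = value
            if name == "msg":  # assumption: msg is the last field of a record
                pending.append(current)
                current = {}
```

Feeding it the lines of out.tmp (for example via open("out.tmp")) would yield one "file:line: id: message" string per record, at the cost of the extra pass that the workaround already implies.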

I'd like the file := entry to come first, so that I can convert the output into something like:

C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:1: F002: directory structure too deep
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:43: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:52: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:52: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:52: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:52: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:53: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:53: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:53: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:53: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:68: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:116: L001: trailing whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:158: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:165: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:172: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:179: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:186: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:193: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:198: T012: negation operator used in its short form
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:203: T012: negation operator used in its short form
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:210: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:217: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:224: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:231: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:238: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:245: T009: comma should be followed by whitespace
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:255: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:274: T019: full block {} expected in the control structure
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:275: T019: full block {} expected in the control structure
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:276: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:284: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:287: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:288: T019: full block {} expected in the control structure
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:292: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:304: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:327: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:330: T011: closing curly bracket not in the same line or column
C:\Users\toto\Downloads\GLFrontier-win32\src\_host.c:331: L004_120: line is longer than 120 characters
C:\Users\toto\Downloads\GLFrontier-win32\src\as68k\as68k.c:1: F002: directory structure too deep
C:\Users\toto\Downloads\GLFrontier-win32\src\as68k\as68k.c:1: L003: leading empty line(s)
C:\Users\toto\Downloads\GLFrontier-win32\src\as68k\as68k.c:29: T009: comma should be followed by whitespace
...

Any ideas?

Perhaps there's a way to specify the output format and get the expected file in one command.

Regards.

Reino17 commented 1 year ago

Hello Kochise. Looking at your desired output (just concatenated strings), I'd say that using two for-loops is a much simpler approach:

xidel -s "out.xml" -e "for $x in //file for $y in $x/error return `{$x/@name}:{$y/@line}: {$y/@source}: {$y/@message}`"

xidel -s "out.xml" -e ^"^
  for $x in //file^
  for $y in $x/error^
  return^
  `{$x/@name}:{$y/@line}: {$y/@source}: {$y/@message}`^
"

(If your binary is older than 0.9.9-8787, use x'...' instead of `...`.)
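For readers without xidel at hand, the same two nested loops can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal sketch and not part of xidel: the function name checkstyle_to_lines and the shortened SAMPLE document are made up for illustration, and it assumes the checkstyle layout shown in the question (file elements with a name attribute, each containing error elements).

```python
import xml.etree.ElementTree as ET
from typing import List

# Shortened, hypothetical stand-in for out.xml.
SAMPLE = """<checkstyle version="5.0">
    <file name="C:\\src\\_host.c">
        <error source="F002" severity="info" line="1" message="directory structure too deep" />
        <error source="L001" severity="info" line="116" message="trailing whitespace" />
    </file>
    <file name="C:\\src\\as68k.c">
        <error source="L003" severity="info" line="1" message="leading empty line(s)" />
    </file>
</checkstyle>"""


def checkstyle_to_lines(xml_text: str) -> List[str]:
    """Return one 'file:line: source: message' string per error element."""
    root = ET.fromstring(xml_text)
    lines = []
    for file_el in root.iter("file"):        # outer loop: for $x in //file
        name = file_el.get("name")
        for err in file_el.findall("error"):  # inner loop: for $y in $x/error
            lines.append(
                f"{name}:{err.get('line')}: {err.get('source')}: {err.get('message')}"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(checkstyle_to_lines(SAMPLE)))
```

To process the real report, ET.parse("out.xml").getroot() can replace ET.fromstring. Because the file name is read in the outer loop, it is already known when each error is formatted, so no reordering step is needed.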