microsoft / vcpkg

C++ Library Manager for Windows, Linux, and MacOS
MIT License

Missing whitespace in `vcpkg portsdiff` output makes parsing output infeasible for long port names #40030

Closed. rtzoeller closed this issue 2 months ago.

rtzoeller commented 2 months ago

Describe the bug

I am interested in parsing the output of `vcpkg portsdiff` as part of some tooling to automatically update vcpkg manifest files.

Currently a regex like `- (\S+)\s+(\S+) -> (\S+)` can extract nearly all port updates, but it fails for ports with long names because vcpkg omits the whitespace between the port name and the version. A similar issue affects the "added ports" section, but I'm not currently parsing that.

See https://regex101.com/r/8w4SsK/1 for an example.
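As a minimal sketch of the failure, the regex from this report can be applied to two lines taken from the console output below. The helper name `parse_naive` is hypothetical and only exists for this illustration.

```python
import re

# The pattern quoted in this report, applied to one line of output at a time.
UPDATED = re.compile(r"- (\S+)\s+(\S+) -> (\S+)")

def parse_naive(line):
    """Return (name, old_version, new_version), or None if the line does not match."""
    m = UPDATED.search(line)
    return m.groups() if m else None

# A short port name is padded with whitespace, so all three groups are captured.
print(parse_naive("- abseil 20240116.2#2 -> 20240116.2#3"))

# A long port name is fused with its old version, so there is no whitespace
# for \s+ to consume between the first two groups and the match fails.
print(parse_naive("- azure-storage-blobs-cpp12.11.0 -> 12.12.0"))
```

The second call returns `None`, which is how the long-named ports silently drop out of the parsed results.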

Environment

To Reproduce

Steps to reproduce the behavior:

  1. Run `./vcpkg portsdiff 101cc9a69a1061969caf4b73579a34873fdd60fe 821100d967e1737d96414a308e3f7cbe0d1abf18`.
  2. Observe the output for the `azure-storage-*` ports.

Expected behavior

`vcpkg portsdiff` should always emit at least one whitespace character between the port name and the port version.
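The expected behavior can be illustrated with a small Python sketch. This is not vcpkg's actual implementation (vcpkg is written in C++); the function names and the column width are assumptions chosen only to show how padding to a fixed width fuses long names, and how an explicit separator avoids it.

```python
def format_row_fused(name, version, width=20):
    # ljust pads short names up to `width` but never truncates, so a name
    # at least `width` characters long ends up directly next to the version.
    return "- " + name.ljust(width) + version

def format_row_separated(name, version, width=20):
    # Same padding, plus one unconditional space, so a whitespace-based
    # parser can always split the name from the version.
    return "- " + name.ljust(width) + " " + version

print(format_row_fused("azure-storage-blobs-cpp", "12.11.0"))
print(format_row_separated("azure-storage-blobs-cpp", "12.11.0"))
```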

Failure logs

Console output

```
$ ./vcpkg portsdiff 101cc9a69a1061969caf4b73579a34873fdd60fe 821100d967e1737d96414a308e3f7cbe0d1abf18
The following 10 ports were added:
- audit 4.0.1
- cpprealm 2.1.0
- htscodecs 1.6.0
- htslib 1.20
- libcgroup 3.1.0
- libsersi 0.1.0
- opentelemetry-cpp-contrib-version2024-06-17
- soapysdr 0.8.1
- spglib 2.4.0
- tdscpp 20240707

The following 143 ports were updated:
- ableton-link 3.1.1 -> 3.1.1#1
- abseil 20240116.2#2 -> 20240116.2#3
- arg-router 1.4.0 -> 1.4.0#1
- arrayfire 3.8.0#5 -> 3.8.0#6
- asmjit 2023-03-25 -> 2024-06-28
- assimp 5.4.0#1 -> 5.4.2
- azure-core-cpp 1.12.0 -> 1.13.0
- azure-storage-blobs-cpp12.11.0 -> 12.12.0
- azure-storage-common-cpp12.6.0 -> 12.7.0
- azure-storage-files-datalake-cpp12.10.0 -> 12.11.0
- azure-storage-files-shares-cpp12.9.0 -> 12.10.0
- azure-storage-queues-cpp12.2.0#1 -> 12.3.0
- bgfx 1.127.8725-469#1 -> 1.128.8777-475
- blend2d 2023-06-16#1 -> 2024-07-08
- breakpad 2023-06-01#1 -> 2023-06-01#2
- cachelib 2024.06.24.00 -> 2024.07.15.00
- cdt 1.4.0 -> 1.4.1
- cnats 3.8.0 -> 3.8.2
- configcat 4.0.1 -> 4.0.3
- cpp-sort 1.15.0 -> 1.16.0
- cppgraphqlgen 4.5.5 -> 4.5.7
- cpptrace 0.6.2 -> 0.6.3
- crashpad 2024-04-11 -> 2024-04-11#1
- cserialport 4.3.0#1 -> 4.3.1
- ctbignum 2019-08-02#4 -> 2019-08-02#5
- curl 8.8.0#3 -> 8.8.0#4
- cxxgraph 2.0.0 -> 4.1.0
- dataframe 3.1.0 -> 3.2.0
- dav1d 1.4.0 -> 1.4.0#1
- dbus 1.15.8#4 -> 1.15.8#5
- dstorage 1.2.2 -> 1.2.3
- dv-processing 1.7.9#1 -> 1.7.9#2
- ecal 5.12.0#1 -> 5.13.2
- edflib 1.25 -> 1.26
- fastgltf 0.7.1 -> 0.7.2
- fastio 2023-11-06 -> 2024-07-05
- fbgemm 0.4.1 -> 0.4.1#1
- fbthrift 2024.06.24.00 -> 2024.07.15.00
- ffmpeg 6.1.1#10 -> 6.1.1#11
- fftw3 3.3.10#8 -> 3.3.10#9
- fizz 2024.06.24.00 -> 2024.07.15.00
- flecs 3.2.11 -> 4.0.0
- folly 2024.06.24.00#1 -> 2024.07.15.00
- geographiclib 2.3#1 -> 2.4
- gettext-libintl0.22.5#1 -> 0.22.5#2
- ginkgo 1.7.0 -> 1.8.0
- glaze 2.6.9 -> 2.9.2
- glslang 14.2.0 -> 14.2.0#1
- google-cloud-cpp2.25.0 -> 2.26.0#1
- gtk 4.10.5#2 -> 4.14.0#1
- gtkmm 4.10.0#1 -> 4.14.0
- hdf5 1.14.4.3#1 -> 1.14.4.3#2
- icu 74.2#3 -> 74.2#4
- imageinfo 2024-02-21 -> 2024-07-14
- imgui 1.90.7 -> 1.90.7#1
- itk 5.3rc02#1 -> 5.4.0
- kerbal 2024.5.1 -> 2024.6.1
- krb5 1.21.2#4 -> 1.21.3
- libass 0.17.2 -> 0.17.3
- libbson 1.27.3 -> 1.27.4
- libiconv 1.17#3 -> 1.17#4
- libidn2 2.3.7 -> 2.3.7#1
- libjpeg-turbo 3.0.2#1 -> 3.0.3
- liblas 1.8.1#14 -> 1.8.1#15
- liblsquic 3.3.2 -> 3.3.2#1
- libmediainfo 24.5 -> 24.6
- libmodplug 0.8.9.0#12 -> 0.8.9.0#13
- libnick 2024.6.9 -> 2024.7.2
- libopenmpt 0.7.4 -> 0.7.4#1
- liborigin 3.0.2#2 -> 3.0.3
- libpqxx 7.9.0 -> 7.9.0#1
- librdkafka 2.3.0#1 -> 2.3.0#3
- libsrt 1.5.3#1 -> 1.5.3#2
- libssh 0.10.5#2 -> 0.10.6
- libsvm 3.32 -> 3.32#1
- libsystemd 255#2 -> 256.2
- libtorch 2.1.2#2 -> 2.1.2#3
- liburing 2.6 -> 2.6#1
- libuv 1.46.0#1 -> 1.48.0
- libxml2 2.11.7 -> 2.11.8
- libyuv 1857 -> 1857#1
- llvm 18.1.6 -> 18.1.6#1
- lz4 1.9.4#1 -> 1.9.4#2
- magic-enum 0.9.5 -> 0.9.6
- magnum 2020.06#18 -> 2020.06#19
- mbedtls 2.28.7 -> 2.28.8
- mimalloc 2.1.2#3 -> 2.1.7
- minizip-ng 4.0.5 -> 4.0.7
- mnn 1.1.0#5 -> 1.1.0#6
- mongo-c-driver 1.27.3 -> 1.27.4
- mongo-cxx-driver3.10.1 -> 3.10.2
- mvfst 2024.06.24.00 -> 2024.07.15.00
- nanoflann 1.5.5 -> 1.6.0
- nanopb 0.4.8 -> 0.4.8#1
- nativefiledialog-extended1.1.1 -> 1.2.0
- netgen 6.2.2401#1 -> 6.2.2401#2
- opencolorio 2.2.1#2 -> 2.2.1#3
- opencv4 4.8.0#20 -> 4.8.0#21
- openimageio 2.5.12.0#1 -> 2.5.12.0#2
- opentelemetry-cpp1.14.2#2 -> 1.16.0#2
- osgearth 3.4#2 -> 3.4#3
- paho-mqttpp3 1.3.2 -> 1.4.1
- polyhook2 2024-02-08 -> 2024-06-03
- proj 9.4.1 -> 9.4.1#1
- proxygen 2024.06.24.00 -> 2024.07.15.00
- qca 2.3.7#1 -> 2.3.7#2
- qt-advanced-docking-system4.3.0 -> 4.3.1
- qtconnectivity 6.7.2 -> 6.7.2#1
- qtsensors 6.7.2 -> 6.7.2#1
- qtwebchannel 6.7.2 -> 6.7.2#1
- quill 4.5.0 -> 5.0.0
- rbdl 3.3.0#6 -> 3.3.0#7
- realm-core 14.8.0 -> 14.10.4
- rxqt bb2138c#1 -> d0b1535#1
- saucer 2.1.0 -> 2.3.0
- sdbus-cpp 1.5.0 -> 2.0.0
- sdl2 2.30.3#1 -> 2.30.5#1
- seacas 2022-11-22#5 -> 2022-11-22#6
- sese 2.1.2 -> 2.2.0#1
- shader-slang 2024.1.12 -> 2024.1.25
- shiftmedia-libgnutls3.7.6#4 -> 3.8.3
- simd 5.3.128#1 -> 6.1.139
- simdutf 5.2.8 -> 5.3.0
- skia 124#1 -> 127
- small-gicp 0.1.0 -> 0.1.2
- sqlite3 3.46.0#1 -> 3.46.0#2
- stdexec 2024-06-16 -> 2024-06-16#2
- tbb 2021.11.0 -> 2021.13.0
- thrift 0.20.0 -> 0.20.0#1
- tinygltf 2.8.23 -> 2.9.2
- tracy 0.10.0#2 -> 0.11.0
- unittest-cpp 2.0.0#5 -> 2.0.0#6
- v8 9.1.269.39#7 -> 9.1.269.39#8
- vtk 9.3.0-pv5.12.1#1 -> 9.3.0-pv5.12.1#2
- wangle 2024.06.24.00 -> 2024.07.15.00
- wg21-linear-algebra0.7.3 -> 0.7.3#1
- wolfssl 5.7.0#1 -> 5.7.2
- wxwidgets 3.2.5#1 -> 3.2.5#3
- yara 4.5.1 -> 4.5.1#1
- yyjson 0.9.0 -> 0.10.0
- zlmediakit 2024-03-30#2 -> 2024-03-30#3
- ztd-cuneicode 2023-11-03 -> 2023-11-03#1
- ztd-text 2023-11-03 -> 2023-11-03#1
```

Additional context

Maybe there's another entry point I should be going through to extract this data?
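Until the output changes, one possible client-side workaround is to split a fused token using a set of known port names. The sketch below assumes the caller can supply that set, for example from the directory names under `ports/` in a vcpkg checkout. `parse_update` and `known_ports` are hypothetical names for this illustration, not part of vcpkg.

```python
import re

# The old version is optional here, so lines with a fused name and version still match.
UPDATE_LINE = re.compile(r"^\s*- (\S+)(?:\s+(\S+))? -> (\S+)\s*$")

def parse_update(line, known_ports):
    """Return (name, old_version, new_version), or None if the line cannot be parsed."""
    m = UPDATE_LINE.match(line)
    if m is None:
        return None
    first, old, new = m.groups()
    if old is not None:
        # Normal case: whitespace separated the name from the old version.
        return first, old, new
    # Fused case: take the longest known port name that is a proper prefix
    # of the token and treat the remainder as the old version.
    candidates = [p for p in known_ports if first.startswith(p) and len(p) < len(first)]
    if not candidates:
        return None
    name = max(candidates, key=len)
    return name, first[len(name):], new

# Hypothetical subset of port names, used only for this example.
known_ports = {"azure-core-cpp", "azure-storage-blobs-cpp", "opentelemetry-cpp"}
print(parse_update("- azure-core-cpp 1.12.0 -> 1.13.0", known_ports))
print(parse_update("- azure-storage-blobs-cpp12.11.0 -> 12.12.0", known_ports))
```

The longest-prefix rule is a heuristic: it can still guess wrong when one port name is a prefix of another and the remainder also looks like a version, so a guaranteed separator in the output remains the more reliable fix.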

jimwang118 commented 2 months ago

Confirmed that the issue has been reproduced locally.