php / doc-base

Tools for the PHP documentation
Other
343 stars 85 forks source link

Strict `[skip-revcheck]` #132

Open alfsb opened 3 months ago

alfsb commented 3 months ago

This is a tentative PR to fix the issues below. This PR changes doc-base's revcheck.php to:

  1. Strictly checks for [skip-revcheck] as the first text on the first line of a commit message;
  2. Strictly ignore a sequence of [skip-revcheck] commits.

Issue one

First observe the contents of commit https://github.com/php/doc-en/commit/8f6fd5c55ab10709a4ff8daf6140dea422c1363c and after his commit message, below:

    Add missing libxml related constants (#3388)
    * [skip-revcheck] Fix whitespace
    * Add missing libxml related constants

This commit should mark the translations as outdated or be skiped?

As now, the code searches for [skip-revcheck] in any part of the commit message. See: https://github.com/php/doc-base/blob/8b3169819f3598ed6ac14597591d1630be4dff04/scripts/revcheck.php#L302-L305

Issue two

Observe these commit history:

That is, a sequence of commits marked with [skip-revcheck]. Should these files be marked as outdated each time there are two or more commits marked with [skip-revcheck] in sequence?

The current code always gets the commit -2 of history when the last commit is marked [skip-revcheck]. See: https://github.com/php/doc-base/blob/8b3169819f3598ed6ac14597591d1630be4dff04/scripts/revcheck.php#L384

Possible another issue

In the case of https://github.com/php/doc-en/commits/765b2d6eec7dfbaeed900b32aa91a1360d73df42/reference/ds/ds.deque.xml , with the hashes and messages as:

4d17b7 [skip-revcheck] Convert class markup to be compatible with DocBook 5.2
6cecca [skip-revcheck] Normalize &Constants; and &Methods; usage (#2703)
b2a26b These should include the namesapce

What hash should appear in revcheck as the "target" of translations and diffs? The most recent (4d17b7) or the last non-skipped (b2a26b)?


The first issue caused about 40 files to not be marked outdated in all translations. The second issue is harder to calculate, as it involves analysing the commit history of each file, against each revtag.

If this PR gets approval, it will generate about a hundred of files to be marked as outdated in all translations, and will also desynchronize the revcheck script of doc-base and web-doc.

I plan to, after merging this, to change the lib version of this code in doc-base/tree/master/scripts/translation to reflect the new strict rules on lib code, and to open issues against code duplication on doc-base and web-doc. I would also resolve the issue against doc-base, but I probably will not tackle the web-doc issue as I do not know to test or deploy this infrastructure.

alfsb commented 2 months ago

PR ready for review and comments. As this changes a tool that is used by (all?) translations, this is a thing I would like to hear comments before merging.

I would especially like to hear from people with access to web-php revcheck.php to comment, and try to synchronize the change here and there.

@Girgias @nilgun

alfsb commented 2 months ago

I don't see a problem with waiting a little, and I'm glad that you can make the change to the web-doc/revcheck.php, and thus not have the problem of desynchronization.

As these changes will trigger revisions of dozens of files in each translation, I think it's important to ping the lists, right before or right after the merge.

With this PR merged, I will try to start the work of removing code duplication from revchecks, so that future changes do not depend on so much repo coordination/code duplication.

alfsb commented 2 months ago

I'm now confident that this PR will only mark files incorrectly skipped (squashed commits that contain an [skip-revcheck]) and will flag files mismarked with a hash of a skipped commit.

This version also runs 10x faster on pt_BR translation, which is somewhat expected as this new code executes way less shell git commands. Before and after timing:

real 0m25,360s
user 0m21,151s
sys 0m4,974s

real 0m1,969s
user 0m1,918s
sys 0m0,329s
nilgun commented 2 months ago

This is a tentative PR to fix the issues below. This PR changes doc-base's revcheck.php to:

1. Strictly checks for `[skip-revcheck]` as the first text on the first line of a commit message;

2. Strictly ignore a sequence of `[skip-revcheck]` commits.

Issue one

First observe the contents of commit php/doc-en@8f6fd5c and after his commit message, below:

    Add missing libxml related constants (#3388)
    * [skip-revcheck] Fix whitespace
    * Add missing libxml related constants

This commit should mark the translations as outdated or be skiped?

As now, the code searches for [skip-revcheck] in any part of the commit message. See:

https://github.com/php/doc-base/blob/8b3169819f3598ed6ac14597591d1630be4dff04/scripts/revcheck.php#L302-L305

This is a committer error. [skip-revcheck] is used incorrectly. This should be two separate commits.

Issue two

Observe these commit history:

* https://github.com/php/doc-en/commits/765b2d6eec7dfbaeed900b32aa91a1360d73df42/language/context/parameters.xml

* https://github.com/php/doc-en/commits/765b2d6eec7dfbaeed900b32aa91a1360d73df42/reference/ds/ds.deque.xml

That is, a sequence of commits marked with [skip-revcheck]. Should these files be marked as outdated each time there are two or more commits marked with [skip-revcheck] in sequence?

The current code always gets the commit -2 of history when the last commit is marked [skip-revcheck]. See:

https://github.com/php/doc-base/blob/8b3169819f3598ed6ac14597591d1630be4dff04/scripts/revcheck.php#L384

These have already been done in all languages' files. Otherwise configure.php would fail.

Possible another issue

In the case of https://github.com/php/doc-en/commits/765b2d6eec7dfbaeed900b32aa91a1360d73df42/reference/ds/ds.deque.xml , with the hashes and messages as:

4d17b7 [skip-revcheck] Convert class markup to be compatible with DocBook 5.2
6cecca [skip-revcheck] Normalize &Constants; and &Methods; usage (#2703)
b2a26b These should include the namesapce

What hash should appear in revcheck as the "target" of translations and diffs? The most recent (4d17b7) or the last non-skipped (b2a26b)?

The first issue caused about 40 files to not be marked outdated in all translations. The second issue is harder to calculate, as it involves analysing the commit history of each file, against each revtag.

If this PR gets approval, it will generate about a hundred of files to be marked as outdated in all translations, and will also desynchronize the revcheck script of doc-base and web-doc.

I plan to, after merging this, to change the lib version of this code in doc-base/tree/master/scripts/translation to reflect the new strict rules on lib code, and to open issues against code duplication on doc-base and web-doc. I would also resolve the issue against doc-base, but I probably will not tackle the web-doc issue as I do not know to test or deploy this infrastructure.

The first issue was already done in all languages' files, as far as I remember. That's why they are skip-revchecked. Otherwise configure.php would fail.

The second issue does not affect translations.

I don't understand why the third issue is objected.

--

I don't think any changes are needed. Committers should be trained better.

alfsb commented 2 months ago

This is a committer error. [skip-revcheck] is used incorrectly. This should be two separate commits.

The original committer used separate commits. The issue above occurs when commits are squashed. This is a fix for squashed commits.

That is, a sequence of commits marked with [skip-revcheck]. Should these files be marked as outdated each time there are two or more commits marked with [skip-revcheck] in sequence?

These have already been done in all languages' files. Otherwise configure.php would fail.

configure.php does not fail when there are consecutive [skip-revcheck] commits. The issue is files being marked outdated when there are two or more [skip-revcheck] commits.

~/phpdoc$ ls -d */
doc-base/  en/  pt_BR/
~/phpdoc$ php doc-base/scripts/revcheck.php pt_BR > rev1.html
~/phpdoc$ cd en/
~/phpdoc/en$ echo " " >> language/context/parameters.xml 
~/phpdoc/en$ git commit -a -m "[skip-revcheck] One."
[master 39d8257a8f] [skip-revcheck] One.
 1 file changed, 1 insertion(+)
~/phpdoc/en$ echo "  " >> language/context/parameters.xml
~/phpdoc/en$ git commit -a -m "[skip-revcheck] Two."
[master ff12fe660d] [skip-revcheck] Two.
 1 file changed, 1 insertion(+)
~/phpdoc/en$ git log -3 --format=%H
ff12fe660d93eda71c146cf84508f6c2a3c613af
39d8257a8f69acdecdfd08d7c3fff4457564b895
be8baf89c0016c1c39ab0b85cafca28c8364e95b
~/phpdoc/en$ cd ..
~/phpdoc$ php doc-base/scripts/revcheck.php pt_BR > rev2.html
~/phpdoc$ cat rev1.html | grep "parameters.xml" | wc
      0       0       0
~/phpdoc$ cat rev2.html | grep "parameters.xml" | wc
      3      13     820

The first issue was already done in all languages' files, as far as I remember. That's why they are skip-revchecked. Otherwise configure.php would fail.

configure.php does not fail. Running php doc-base/configure.php --with-lang=pt_BR before or after the commands above, configure.php will complete sucefully.

nilgun commented 1 month ago

These are DTD changes. It is not possible to apply them to doc-en and not to others.

When I look at it, I see that it is also applied to pt_BR:

https://github.com/php/doc-en/commit/8bc832a464e33122e8129f5a623bd845b69fa7e0

https://github.com/php/doc-pt_br/commit/f59baa1292fed47c43874fe7f8c87a265cba7f0d

alfsb commented 1 month ago

These are DTD changes. It is not possible to apply them to doc-en and not to others. When I look at it, I see that it is also applied to pt_BR:

The link of first issue already shows a case not related with DTD changes: https://github.com/php/doc-en/commit/8f6fd5c55ab10709a4ff8daf6140dea422c1363c

Also, there are dozen of other cases in git history, showing that sequential commits occurs, outside coordinated efforts.


git log --format=%h  %s -- reference/radius/constants.xml
f27cfeeefc  Update examples using md5(...) to use hash('md5', ...) (#3619)
3dbacb703e  Add missing Radius constant IDs
6c946c61e9  [skip-revcheck] Fix whitespace
39729c23a8  [skip-revcheck] Update Radius constant ID (#3215)
86e6094e86  Use canonical type names

git log --format=%h  %s -- reference/curl/constants.xml
b8ea7cb9cf  Change wording from "Available since" to "Available as of"
5743eea6e4  Copy constants from curl_getinfo to dedicated constants page
72b9959ea6  [skip-revcheck] Fix whitespace
47d4ae82e6  [skip-revcheck] Update cURL constant IDs (#3224)
4a0b755422  [PHP 8.3] New cURL constants (#3089)

git log --format=%h  %s -- reference/sockets/constants.xml
cbf8c63630  Add socket constant descriptions (#3499)
10d68da45d  Add missing socket constants
c56da98841  [skip-revcheck] Fix whitespace
b9142963a5  [skip-revcheck] Update socket constant IDs (#3226)
cf5cf50fbc  [PHP 8.3] socket constants update. (#2948)

git log --format=%h  %s -- reference/filesystem/constants.xml

850ac483c8  XInclude GLOB constants in glob.xml
3675d8d7e4  Add GLOB_ERR and description of GLOB constants
1ecf4d6872  [skip-revcheck] Fix whitespace
5e9500ddad  [skip-revcheck] Syncronize predefined constants with stubs - part 1 (#2739)
aab33d6443  Removed references to PHP 5, PHP 7.0 (#544)
86e6094e86  Use canonical type names

git log --format=%h  %s -- reference/image/constants.xml
7f5d646f3e  Add IMG_CROP constants (#3504)
fe955bf734  Add missing image constants
8e875f1349  [skip-revcheck] Fix whitespace
5e9500ddad  [skip-revcheck] Syncronize predefined constants with stubs - part 1 (#2739)
9bef3400f8  Fix GH-1380: Add `IMAGETYPE_AVIF` constant (#2628)
928f051323  [PHP 8.1] Add new global GD constants (#1545)

git log --format=%h  %s -- reference/session/sessionhandler.xml
601f6f4ce5  Update ext/session role attributes (#1987)
2ab45105b4  Stop referring to `openssl_random_pseudo_bytes()` outside of ext/openssl (#1967)
6d1ca753fb  [skip-revcheck] Fix typos
4eb60e8132  [skip-revcheck] Generate ext/session class synopses from stubs
5fabd07880  Removed Changelog entries for PHP 5, PHP 7.0 (#543)
fcfd6ea1d2  Document SessionIdInterface

git log --format=%h  %s -- reference/sodium/sodiumexception.xml
09c49da6f0  Update Error, Exception, and Throwable role attributes (#2069)
8f28fbb43a  Revert "Revert "Generate ext/sodium class synopses from stubs""
6700a28f74  [skip-revcheck] Revert "Generate ext/sodium class synopses from stubs"
1b3769ae7d  [skip-revcheck] Generate ext/sodium class synopses from stubs
8469c26f01  SodiumException has no members

git log --format=%h  %s -- reference/fileinfo/constants.xml
4968cab7de  Add FILEINFO_APPLE constant
790b69f2e0  [skip-revcheck] Fix whitespace
5e9500ddad  [skip-revcheck] Syncronize predefined constants with stubs - part 1 (#2739)
aab33d6443  Removed references to PHP 5, PHP 7.0 (#544)

git log --format=%h  %s -- reference/gmp/constants.xml
18ad21afec  Add GMP_MPIR_VERSION constant
2ff877cdec  [skip-revcheck] Fix whitespace
5e9500ddad  [skip-revcheck] Syncronize predefined constants with stubs - part 1 (#2739)
86e6094e86  Use canonical type names

git log --format=%h  %s -- install/ini.xml
d228d0aa06  [skip-revcheck] Convert info constant tables to variablelists (#3342)
15841d1884  Link INI mode table
327111cc5d  [skip-revcheck] Fix whitespace
d4d5216e7a  [skip-revcheck] Replace PHP_INI_* with INI_* constants
4eeb07225f  Remove old install docs (#662)

git log --format=%h  %s -- language/types/type-juggling.xml
f94d903985  Fix markup issues in language section
183439d468  [skip-revcheck] Correct sentence grammar in type juggling page (#2942)
8870f9d1ce  [skip-revcheck] Remove double "are" (#2689)
4498e3ed86  Fix GH-2077, not just numeric strings can be implicitly converted to int in functions

git log --format=%h  %s -- language/predefined/variables/superglobals.xml
a6d209f4ff  [skip-revcheck] Use <refentry> tag with role attribute instead of custom XML tag
ec6e871a47  [skip-revcheck] Replace role="noversion" attribute usage with the new annotations one
fd6629b1ac  Remove duplicate "Superglobals"

git log --format=%h  %s -- reference/datetime/dateperiod/construct.xml
7d81260767  Document new DatePeriod::createFromISO8601String method in PHP 8.3
71692b6f4c  Add Date/Time exception documentation
02ff7fef5b  Update ext/date role attributes (#1972)
655c1b1b94  [skip-revcheck] Fix typo (offets → offsets)
27f6943df8  Fixed #1291: DatePeriod::__construct(string ) - ISO string does not accept time zone offsets
b7ac6fa547  [skip-revcheck] Fix typos
aab7fca69f  [skip-revcheck] Remove trailing whitespace
c249f3bc56  Incorporate useful information from notes (take #2)

git log --format=%h  %s -- reference/datetime/functions/date-parse.xml
50dcf55ce5  date_parse does not return false (#2395)
19e8122137  Remove references to "double" and use float instead (#1848)
20415705a3  Update date-parse.xml (#1832)
5c951013ca  Incorporate useful information from notes (take #3)
6f845b90b4  [skip-revcheck] Undo false typo
e7b5172917  [skip-revcheck] Typos
bbfdc364de  Swap DateTime and DateTimeImmutable documention to push people towards using DateTimeImmutable
```.
nilgun commented 1 month ago

If your claim is true, the PHP documentation seems to be a mess in terms of translations. If previous commits have made things so confusing, I will not translate the PHP documentation from now on. Do what you want. It does not concern me anymore.

alfsb commented 1 month ago

I haven't looked at the revcheck script in years, and I am busy so if my review is required it might take a bit longer, but I am happy to merge a PR which is made to the relevant doc repo to have them in sync.

I will try, in the following weeks, to make a PR to the other revcheck.php. There is a bunch of [skip-revcheck] commits I want to send, about entities and <literal> usage, that will bump a lot a files with consecutive [skip-revcheck], so it's better to resolve this issue first.