Closed: snuxs closed this issue 2 years ago
Hi, thank you for pointing this out. Short answer is: version identification for extensions sucks and yes, Typo3Scan needs to be adapted to add priority to the paths. Maybe I can figure something out.
Long answer is: version identification for extensions sucks, because there are many issues with the files. The main issues I found are:
My solution to this was: download all extensions and collect the most common files which could include version information (these are the ones in extensions.py). If such a file exists, it is reported. If no regex is specified for it in extensions.py, a generic regex ('([0-9]+.[0-9]+.[0-9x][0-9x]?)') is used to search for version info. This may be the reason you see the "last modified date".
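To make the effect of the generic fallback concrete, here is a small Python sketch that applies the pattern exactly as printed above, next to a variant with escaped dots. Whether Typo3Scan's own source escapes the dots is not visible in this thread, so the escaped variant, the helper name `first_match`, and the sample strings are assumptions for illustration only.

```python
import re

# Pattern exactly as printed in the comment above: the dots are unescaped,
# so each "." matches any character, not just a literal dot.
GENERIC_AS_PRINTED = re.compile(r'([0-9]+.[0-9]+.[0-9x][0-9x]?)')

# Hypothetical stricter variant with literal dots, for comparison only.
GENERIC_ESCAPED = re.compile(r'([0-9]+\.[0-9]+\.[0-9x][0-9x]?)')


def first_match(pattern, text):
    """Return the first captured group, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None


samples = [
    "Version 2.3.1 released",  # made-up line containing a version string
    "2021-05-12 10:30",        # made-up date, e.g. a "Last modified" value
]

for text in samples:
    print(repr(text),
          "| as printed:", first_match(GENERIC_AS_PRINTED, text),
          "| escaped:", first_match(GENERIC_ESCAPED, text))
```

With unescaped dots the date string is captured as `2021-05-12`, which would look like the "last modified date" described here. The escaped variant returns `2.3.1` for the first sample and nothing for the date.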
As soon as a version string is found, the scanner stops requesting other files. This is probably the reason why Settings.cfg is not requested in your case.
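The stop-on-first-match behaviour described above can be sketched roughly as follows. This is not Typo3Scan's actual code: `find_version`, the `fetch` callable, the candidate ordering, and the fake responses are all hypothetical and only illustrate why the order of the paths decides which version is reported.

```python
import re
from typing import Callable, Optional, Sequence, Tuple


def find_version(
    candidates: Sequence[Tuple[str, str]],
    fetch: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (path, version) for the first candidate whose regex matches.

    `candidates` is an ordered sequence of (path, regex) pairs and `fetch`
    returns the response body for a path, or None if it does not exist.
    """
    for path, pattern in candidates:
        body = fetch(path)
        if body is None:
            continue  # file not found, try the next candidate
        match = re.search(pattern, body)
        if match:
            # Stop here: later candidates are never requested.
            return path, match.group(1)
    return None


# Fake responses standing in for HTTP requests (made-up content).
responses = {
    "/Documentation/ChangeLog": "Changes as of 2021-05-12",
    "/Documentation/Settings.cfg": "release = 1.2.3",
}

# Generic pattern as printed in the comment above, used for both paths.
generic = r'([0-9]+.[0-9]+.[0-9x][0-9x]?)'

changelog_first = [("/Documentation/ChangeLog", generic),
                   ("/Documentation/Settings.cfg", generic)]
settings_first = list(reversed(changelog_first))

print(find_version(changelog_first, responses.get))
print(find_version(settings_first, responses.get))
```

With the ChangeLog path first, the sketch reports the date-like string from that file and never looks at Settings.cfg. With the order reversed, it reports `1.2.3` instead.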
Just as an example to illustrate my point: metaseo_tqseo_migration.
The version is 1.0.0/stable, but this version is nowhere in the extension files. The only version info in Settings.yaml is
version: 6.0
release: 6.0.0
which is basically the supported TYPO3 version. So yeah... extension versions are not reliable and it's a mess.
Hi, thank you for your fast and complete answer!
The "Last Modified" Date does not have to be in a file, it can also be the date when a file in the searched directory has been modified? That was also something that confused me.
I thought it worked something like that, but I missed that the scanner stops requesting other files.
I tried removing all lines except the one where it searches through Settings.cfg, and now the scanner does not find a version at all.
After that I sent a request, checked the response for Settings.cfg, and there was a match for the RegEx:
(?:release:)\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)
I suppose that is an issue on my (web)site, because the same request works for other extensions.
But thank you for your time and help. With that knowledge I will try to force the Settings.cfg request for the mask extension.
> The "Last Modified" Date does not have to be in a file, it can also be the date when a file in the searched directory has been modified? That was also something that confused me.
Nah, this cannot be the case. You just request the file and search the file content for version info. You don't have access to the file system.
> After that I tried a request and checked the response of settings.cfg and there was a match for the RegEx (?:release:)\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)
Interesting. And the version info is not reported by Typo3Scan? What's done after the match?
> Nah, this cannot be the case. You just request the file and search the file content for version info. You don't have access to the file system.
Alright, good to know.
> Interesting. And the version info is not reported by Typo3Scan? What's done after the match?
Actually I am trying to reproduce it at the moment but cannot figure out what I did before. I just tried it in the console. Now I am trying to match it again with the RegEx but it does not match. Sorry, I don't know what I did there.
Still, in my understanding it should match (although I'm not too good with RegEx), but maybe you see the issue.
Here are the first 4 lines of the response from the Settings.cfg:
[general]
project = Mask
release = 7.0.10
copyright = 2021
Yeah, matching regex is:
(?:release\s=)\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)
And to be able to also catch release: 7.0.10 you should use:
(?:release\s?[=:])\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)
Edit: there are plenty of regex validators out there. E.g. https://extendsclass.com/regex-tester.html#python
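As a quick check of the patterns discussed above, the following Python snippet runs the original `release:` pattern and the two suggested ones against both notations. The helper name `extract` and the sample lines are made up for illustration; only the regexes come from this thread.

```python
import re

# Patterns quoted in this thread.
ORIGINAL = r'(?:release:)\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)'
EQUALS_ONLY = r'(?:release\s=)\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)'
COMBINED = r'(?:release\s?[=:])\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)'


def extract(pattern, text):
    """Return the captured version string, or None if the pattern misses."""
    m = re.search(pattern, text)
    return m.group(1) if m else None


cfg_line = "release = 7.0.10"   # notation from the Settings.cfg response
yaml_line = "release: 7.0.10"   # colon notation

for name, pattern in [("original", ORIGINAL),
                      ("equals only", EQUALS_ONLY),
                      ("combined", COMBINED)]:
    print(f"{name:12} cfg={extract(pattern, cfg_line)} yaml={extract(pattern, yaml_line)}")
```

Only the combined pattern returns `7.0.10` for both lines; the original misses the `=` notation and the equals-only pattern misses the `:` notation. One thing to keep in mind: the trailing `[0-9]?[0-9]?` captures at most two patch digits, so a hypothetical `release = 7.0.100` would be reported as `7.0.10`.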
Awesome! Both regexes work for me. Thank you!
> (?:release\s?[=:])\s?([0-9]+\.[0-9]+\.?[0-9]?[0-9]?)
If I use this regex in the scanner, it works for both "release:" and "release =", so I probably would not run into any issues in the normal case, I guess?
> Edit: there are plenty of regex validators out there. E.g. https://extendsclass.com/regex-tester.html#python
Will use that next time, thank you for the hint!
> If I use this regex in the scanner, it works for both "release:" and "release =", so I probably would not run into any issues in the normal case, I guess?
Yes, this will work. Making a push request right now.
Could you please use the dev branch and see if it works as intended?
On it. First test without removing the Changelog lines gave me the date again. I am trying again.
> > The "Last Modified" Date does not have to be in a file, it can also be the date when a file in the searched directory has been modified? That was also something that confused me.
> Nah, this cannot be the case. You just request the file and search the file content for version info. You don't have access to the file system.
Sorry, I have to ask again, I did not completely get it. If I look in "/mask/Documentation/ChangeLog/" I can see 2 folders and one "file.rst". I don't think the scanner searches through the folders, and inside the "file.rst" there is no date at all. The version output I get is the last modified date of the file and folders (all were modified on the same date).
Furthermore, if the scanner requests a file and reads its content, how is it possible that the scanner gets version information out of "/Documentation/Changelog" if it is not looking for a file like "/Documentation/Changelog/file.xyz"?
> Could you please use the dev branch and see if it works as intended?
Worked as intended for my problem with the mask extension!
This will probably also solve the problems for MetaSEO and [clickstorm]SEO. I will report on that tomorrow after the scans are finished.
> Sorry, I have to ask again, I did not completely get it. If I look in "/mask/Documentation/ChangeLog/" I can see 2 folders and one "file.rst". I don't think the scanner searches through the folders, and inside the "file.rst" there is no date at all. The version output I get is the last modified date of the file and folders (all were modified on the same date).
> Furthermore, if the scanner requests a file and reads its content, how is it possible that the scanner gets version information out of "/Documentation/Changelog" if it is not looking for a file like "/Documentation/Changelog/file.xyz"?
Regexes are used to search for version info in the following files:
/doc/manual.sxw
/composer.json
/doc/manual.pdf
/doc/manual.odt
/Documentation/Settings.yml
/Documentation/Settings.yaml
/Documentation/Settings.cfg
/ChangeLog.txt
/Documentation/ChangeLog
/CHANGELOG.md
Your reported version must be somewhere in one of them. Reading and understanding the source code will also help to answer your questions.
Thanks for the explanation. I probably overlooked it in the files. But the scanner also outputs a .json where it also documents the path to the version file and that path is "/Documentation/Changelog", so no exact file. I worked around it, was just curious.
The updated regex works really well for me. I am able to find the mask, MetaSEO and [clickstorm]SEO versions in most cases now! Thank you for your help!
Finding wrong extension version, even though a file with the right version exists
I am trying to scan for TYPO3 and extension versions. The issue is that for a few extensions the scanner seems to be going in the wrong direction.
The scanner uses the path [url]/Documentation/ChangeLog to find the version of the extension. From there it uses the "Last Modified" date as the version. This happens only for some extensions. Furthermore, these extensions have files with the correct version in [url]/Documentation/Settings.cfg. From my understanding these files also get searched, but the scanner dismisses their result. (Maybe I am wrong here, but it looks like it in the "extensions.py" file.)
If I remove the line where the scanner opens [url]/Documentation/ChangeLog, it still does not use the [url]/Documentation/Settings.cfg path. In this case the scanner uses [url]/CHANGELOG.md. In my case there is no version information in that file, but the scanner uses the first number it can find (an issue with the CHANGELOG.md, not the scanner).
My question is whether there is any way to change the priority of the scanned paths. I do not understand why the [url]/Documentation/ChangeLog path is used and not [url]/Documentation/Settings.cfg.
Extensions I have had this problem with
Thank you for your help and your tool!