Open LeeThompson opened 1 year ago
--help output as of 202306011529
Usage: get-fav.php (Switches)
Available APIs: faviconkit, favicongrabber, google, iconhorse (get-fav-api.ini)
Lists can be separated with space, comma or semi-colon.
--configfile=FILE Pathname to read for configuration.
--list=FILE/LIST Pathname or a delimited list of URLs to check.
--blocklist=FILE/LIST Pathname or a delimited list of MD5 hashes to block.
--validtypes=FILE/LIST Valid icon types (default is gif,webp,png,ico,bmp,svg,jpg)
--logfile=FILE Pathname for log file (default is get-fav.log)
--path=PATH Location to store icons (default is ./)
--size=NUMBER Try to get icon size (default is 16)
--tryhomepage Try homepage first, then APIs. (default is true)
--onlyuseapis Only use APIs.
--disableapis Don't use APIs.
--enableblocklist Enable blocklist. (default is true)
--disableblocklist Disable blocklist.
--store Store favicons locally. (default is true)
--nostore Do not store favicons locally.
--overwrite Overwrite local favicons. (default is false)
--skip Skip local favicons.
--removetld Remove top level domain from filename. (default is false)
--noremovetld Don't remove top level domain from filename.
--tenacious Try all enabled APIs until success. (default is false)
--notenacious Try a random API.
--allowoctetstream Allow MimeType 'application/octet-stream'. (default is false)
--disallowoctetstream Block MimeType 'application/octet-stream' for icons.
--consolemode Force console output.
--noconsolemode Force HTML output.
--debug Enable debug mode.
--help This listing and exit.
--version Show version and exit.
Advanced:
--user-agent=AGENT_STRING Customize the user agent.
--nocurl Disable cURL.
--bufferhttp Buffer HTTP page loading. (default is true)
--nobufferhttp Disable HTTP page load buffering.
--curl-verbose Enable cURL verbose.
--curl-progress Enable cURL progress bar.
--enableapis=FILE/LIST Filename or a delimited list of APIs to enable.
--disableapis=FILE/LIST Filename or a delimited list of APIs to disable.
--http-timeout=SECONDS Set HTTP timeout. (default is 60).
--connect-timeout=SECONDS Set HTTP connect timeout. (default is 30).
--dns-timeout=SECONDS Set DNS lookup timeout. (default is 120).
Logging:
--log Enable debug logging. (default is false)
--nolog Disable debug logging.
--append Append debug log. (default is true)
--noappend Always overwrite debug log.
--timestamp Enable debug log timestamps. (default is true)
--notimestamp Do not show timestamps in debug log.
--loglevel=NUMBER Set debug logging level. (default is 255)
Console:
--level=NUMBER Set debug logging level. (default is 31)
--showtimestamp Enable debug log timestamps. (default is false)
--hidetimestamp Do not show timestamps in debug log.
Notes:
Configuration files use the INI file format. Each value is optional. Comments start with `;`. Complex strings need to be quoted (see the useragent entry below).
[files]
overwrite=true
store=true
local_path=./
[http]
try_homepage=true
http_timeout=60
useragent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0"
[curl]
enabled=true
[global]
debug=true
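The changelog further down mentions that the configuration is read with `parse_ini_file` and `INI_SCANNER_RAW`, merged with `array_replace_recursive`, and then validated. The following is only a rough sketch of that idea, not the code in get-fav.php: the `$defaults` array, the `loadConfig` name, and the 1 to 600 range for numeric values are all assumptions made for illustration, and `parse_ini_string` is used instead of `parse_ini_file` so the example is self-contained.

```php
<?php
// Hypothetical defaults; the real structure in get-fav.php may differ.
$defaults = [
    'files' => ['overwrite' => false, 'store' => true, 'local_path' => './'],
    'http'  => ['try_homepage' => true, 'http_timeout' => 60],
];

// Hypothetical helper: merge INI text over the defaults, then validate types.
function loadConfig(string $iniText, array $defaults): array
{
    // RAW mode leaves values such as "true" or "60" as plain strings.
    $parsed = parse_ini_string($iniText, true, INI_SCANNER_RAW);
    if ($parsed === false) {
        return $defaults;
    }
    $config = array_replace_recursive($defaults, $parsed);

    foreach ($defaults as $section => $values) {
        if (!is_array($config[$section])) {
            // A scalar replaced a whole section; fall back to the defaults.
            $config[$section] = $values;
            continue;
        }
        foreach ($values as $key => $default) {
            $value = $config[$section][$key];
            if (!is_string($value)) {
                continue; // untouched default, already typed
            }
            if (is_bool($default)) {
                $bool = filter_var($value, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
                $config[$section][$key] = $bool ?? $default;
            } elseif (is_int($default)) {
                // Assumed range check: 1..600.
                $int = filter_var($value, FILTER_VALIDATE_INT, [
                    'options' => ['min_range' => 1, 'max_range' => 600],
                ]);
                $config[$section][$key] = ($int === false) ? $default : $int;
            }
        }
    }
    return $config;
}
```

Because every value is optional, anything missing from the INI text simply keeps its default, and a value that fails validation falls back to the default as well.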
Notes on the blocklist concept:
get-fav already does this for the default icon returned by the google API. The blocklist simply extends that to a list of MD5 hashes of other icons the program should ignore.
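As a rough illustration of the blocklist idea, the sketch below hashes the downloaded icon bytes and compares the result against a delimited list of MD5 hashes. The function names `parseList` and `isBlocked` are hypothetical and are not taken from get-fav.php.

```php
<?php
// Hypothetical helper: split a list on space, comma or semi-colon.
function parseList(string $list): array
{
    return preg_split('/[\s,;]+/', $list, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: true when the icon's MD5 hash appears in the blocklist.
function isBlocked(string $iconBytes, array $blocklist): bool
{
    $hash = md5($iconBytes);
    return in_array($hash, array_map('strtolower', $blocklist), true);
}

// Stand-in data; real entries would be hashes of unwanted placeholder icons.
$blocklist = parseList(md5('placeholder icon') . ', ' . md5('another icon'));
var_dump(isBlocked('placeholder icon', $blocklist));
var_dump(isBlocked('a real favicon', $blocklist));
```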
get-fav-api.ini format

- `;` can be used for comments
- `<DOMAIN>`, `<APIKEY>` and `<SIZE>` will be substituted at runtime
- `name` must match exactly

Field | Description |
---|---|
name | ID of the definition (used for enable/disable) |
display | Cosmetic Display Name (defaults to name) |
url | API URL (if it contains = and certain other characters it needs to be quoted) |
json | Does API return json format? |
apikey | Does the API require a key? (not tested) |
enabled | Is this definition enabled? |
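A minimal sketch of the runtime substitution, assuming it is a plain string replacement of the three placeholders. The `buildApiUrl` name and the choice to URL-encode the domain and key are assumptions, not something the script is known to do.

```php
<?php
// Hypothetical helper: fill the placeholders used in get-fav-api.ini URLs.
function buildApiUrl(string $template, string $domain, int $size = 16, string $apiKey = ''): string
{
    return strtr($template, [
        '<DOMAIN>' => rawurlencode($domain),
        '<SIZE>'   => (string) $size,
        '<APIKEY>' => rawurlencode($apiKey),
    ]);
}

echo buildApiUrl('http://www.google.com/s2/favicons?domain=<DOMAIN>&sz=<SIZE>', 'example.com', 32), PHP_EOL;
```

`strtr` with an array replaces all placeholders in a single pass, so a substituted value is never scanned again for further placeholders.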
If a JSON structure is used, it is defined with `json_structure[field] = "item"` entries in the section, for example:
json_structure[icons] = "icons"
json_structure[link] = "src"
json_structure[sizeWxH] = "sizes"
json_structure[mime] = "type"
json_structure[error] = "error"
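To show how such a mapping could be applied to a decoded response, here is a sketch under stated assumptions: the `extractIcons` helper is hypothetical, and the payload in the usage example is made up to resemble a grabber-style response rather than copied from a real API call.

```php
<?php
// The mapping as it would come from the json_structure[...] entries above.
$structure = [
    'icons'   => 'icons',
    'link'    => 'src',
    'sizeWxH' => 'sizes',
    'mime'    => 'type',
    'error'   => 'error',
];

// Hypothetical helper: pull the icon entries out of a JSON response.
function extractIcons(string $json, array $structure): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || isset($data[$structure['error']])) {
        return []; // invalid JSON or the API reported an error
    }
    $list = $data[$structure['icons']] ?? [];
    if (!is_array($list)) {
        return [];
    }
    $icons = [];
    foreach ($list as $entry) {
        if (!is_array($entry) || empty($entry[$structure['link']])) {
            continue; // skip entries without a usable link
        }
        $icons[] = [
            'link'    => $entry[$structure['link']],
            'sizeWxH' => $entry[$structure['sizeWxH']] ?? null,
            'mime'    => $entry[$structure['mime']] ?? null,
        ];
    }
    return $icons;
}

// Made-up sample payload for illustration only.
$sample = '{"domain":"example.com","icons":[{"src":"https://example.com/favicon.ico","sizes":"16x16","type":"image/x-icon"},{"sizes":"32x32"}]}';
print_r(extractIcons($sample, $structure));
```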
Supported Fields are (so far):
Sample:
;
; PHP-Grab-Favicon
; APIs
;
[faviconkit]
display=FavIconKit
name=faviconkit
url=https://api.faviconkit.com/<DOMAIN>/<SIZE>
json=false
enabled=true
[favicongrabber]
display=FavIconGrabber
name=favicongrabber
url=http://favicongrabber.com/api/grab/<DOMAIN>
json=true
enabled=true
json_structure[icons] = "icons"
json_structure[link] = "src"
json_structure[sizeWxH] = "sizes"
json_structure[mime] = "type"
json_structure[error] = "error"
[google]
display=Google
name=google
url="http://www.google.com/s2/favicons?domain=<DOMAIN>&sz=<SIZE>"
json=false
enabled=true
[iconhorse]
display=Icon Horse
name=iconhorse
url=https://icon.horse/icon/<DOMAIN>
json=false
enabled=true
Debug Log File Information
Define | Value | Description |
---|---|---|
`TYPE_ALL` | 1 | Should always be output |
`TYPE_NOTICE` | 2 | Important information |
`TYPE_WARNING` | 4 | Potential issue |
`TYPE_VERBOSE` | 8 | Extra information |
`TYPE_ERROR` | 16 | Something has gone wrong |
`TYPE_DEBUGGING` | 32 | Debug message, usually tops of functions |
`TYPE_TRACE` | 64 | Extra debug messaging, usually sub/helper functions |
`TYPE_SPECIAL` | 128 | Special debug messaging, usually sub/helper functions |
The "shipping" default is 31, which is everything but debugging, trace and special (1 + 2 + 4 + 8 + 16).
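Since the values are powers of two, a level is presumably treated as a bitmask. The sketch below assumes exactly that; the constant values come from the table, while the `shouldLog` helper is hypothetical.

```php
<?php
// Values taken from the table above.
const TYPE_ALL       = 1;
const TYPE_NOTICE    = 2;
const TYPE_WARNING   = 4;
const TYPE_VERBOSE   = 8;
const TYPE_ERROR     = 16;
const TYPE_DEBUGGING = 32;
const TYPE_TRACE     = 64;
const TYPE_SPECIAL   = 128;

// Hypothetical helper: a message is output when its type bit is set in the level.
function shouldLog(int $messageType, int $level): bool
{
    return ($messageType & $level) !== 0;
}

$shipping = TYPE_ALL | TYPE_NOTICE | TYPE_WARNING | TYPE_VERBOSE | TYPE_ERROR;

var_dump($shipping);                        // the combined default level
var_dump(shouldLog(TYPE_ERROR, $shipping)); // error bit is part of the default
var_dump(shouldLog(TYPE_TRACE, $shipping)); // trace bit is not
```

With this scheme, `--loglevel=255` turns on every type in the table, and a custom level can be built by adding up only the values that are wanted.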
By default the timestamp uses `Y-m-d H:i:s`, which looks like `2023-05-25 17:27:39`. There isn't a switch to change it, but it can be changed in the .ini file.

The default separator written when appending to an existing log file is 80 `*` characters. This cannot be changed via a switch either, but it can also be changed in the .ini file:
[logging]
timestampformat="Y-m-d H:i:s"
separator=(whatever)
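For reference, this is roughly what those two defaults produce. It is only a sketch: the `formatLogLine` helper and the sample message are made up, and the time zone is pinned to UTC so the output does not depend on `date.timezone`.

```php
<?php
// Hypothetical helper: prefix a message with a timestamp in the given format.
function formatLogLine(DateTimeInterface $when, string $message, string $format = 'Y-m-d H:i:s'): string
{
    return $when->format($format) . ' ' . $message;
}

// Default separator: 80 asterisks.
$separator = str_repeat('*', 80);

// Fixed time in UTC, purely for a reproducible example.
$when = new DateTimeImmutable('2023-05-25 17:27:39', new DateTimeZone('UTC'));

echo $separator, PHP_EOL;
echo formatLogLine($when, '[NOTICE] made-up sample message'), PHP_EOL;
```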
Switches:
Files:

Switch | Description |
---|---|
`--loglevel=NUMBER` | Log level to use, for everything generally you want 255 |
`--logfile=FILE` | Pathname for log file (default is get-fav.log) |
`--log` / `--nolog` | Enable/Disable Log File |
`--append` / `--noappend` | Enable/Disable Appending the Log File |
`--timestamp` / `--notimestamp` | Use Timestamps in Log File or Not |

Console:

Switch | Description |
---|---|
`--level=NUMBER` | Log level to use, for everything generally you want 255 |
`--showtimestamp` / `--hidetimestamp` | Use Timestamps on Console |

Configuration Options:
[logging]
enabled=true/false
append=true/false
level=value
pathname=filename or full path
separator=separator to use when appending
timestamp=true/false
timestampformat="Y-m-d H:i:s"
[console]
enabled=true/false
level=value
timestamp=true/false
timestampformat="Y-m-d H:i:s"
Notes:
- `date.timezone` being set correctly in the `php.ini` file.
- `define('ENABLE_WEB_INPUT', true);` in the script, which is not the default for security reasons.

Variable | Internal/INI File | Switch | Comments |
---|---|---|---|
`GETFAVDEBUG` | `debug` | `--debug` | Enables special debug mode |

Status:
June 23rd 2023 Haven't been able to do much work this week due to some unexpected household emergencies, should be back at it next week.
202306161401:
- `--showhttpwarnings`, `--showhttperrors`. (With the options enabled, they will output as `TYPE_WARNING` and `TYPE_ERROR`.)
- `TYPE_OBJECTS`, `TYPE_TIMERS` (full debug logging is now `1023`)

202306121848:
- `curl, exif, get, put, mbstring, fileinfo, mimetype, gd, imagemagick, gmagick, hrtime`. If an extension is listed as true in this section but is not loaded or available, it will change to false. (Please note, GD, ImageMagick and gmagick are not currently used at all.)
- `--sites` as an alternate to `--list`

Some notes on this:
Having our own image identification is important should the PHP installation be limited (for whatever reason) and going by file extension is still the last resort.
The method used for this is looking for the "signature" of the image file. Most image formats have a header with signature data to be used by software trying to open it (this is also called a "magic number".) The new code knows PNG, GIF, JPEG, WEBP, BMP and ICO formats.
Some image formats are easier to identify than others. For example, the PNG format's "magic" is `\x89PNG\r\n\x1A\n`, which is pretty good. BMP and ICO have very simple identifiers, so false positives are much more likely, which is why I've been adding a "certainty" rating. Eventually you'll be able to set a minimum acceptable "certainty" and reject possibly invalid files. (You can currently set it but nothing looks at it.)

Here's some sample trace logging showing this in action:
2023-06-12 18:47:21 [TRACE] [grap_favicon(20):listIcons:getMIMETypeFromFile] pathname='icons/whatsapp.png', content_type=image/png, confidence=certain, method=signature
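The signature check described above can be sketched as follows. This is not the `getMIMETypeFromBinary` function from the script: the `sniffImageType` name, the returned array shape, and the confidence labels other than `certain` (which appears in the log line) are assumptions for illustration.

```php
<?php
// Hypothetical helper: guess a MIME type from the leading bytes of a file.
// Returns [mime, confidence] or null when no known signature matches.
function sniffImageType(string $bytes): ?array
{
    $startsWith = static function (string $sig) use ($bytes): bool {
        return strncmp($bytes, $sig, strlen($sig)) === 0;
    };

    if ($startsWith("\x89PNG\r\n\x1A\n")) {
        return ['image/png', 'certain'];      // long, distinctive signature
    }
    if ($startsWith('GIF87a') || $startsWith('GIF89a')) {
        return ['image/gif', 'certain'];
    }
    if ($startsWith('RIFF') && substr($bytes, 8, 4) === 'WEBP') {
        return ['image/webp', 'certain'];     // RIFF container tagged WEBP
    }
    if ($startsWith("\xFF\xD8\xFF")) {
        return ['image/jpeg', 'high'];        // assumed label
    }
    if ($startsWith("\x00\x00\x01\x00")) {
        return ['image/x-icon', 'low'];       // short signature, assumed label
    }
    if ($startsWith('BM')) {
        return ['image/bmp', 'low'];          // two bytes only, assumed label
    }
    return null;
}

var_dump(sniffImageType("\x89PNG\r\n\x1A\n" . 'remaining bytes'));
var_dump(sniffImageType('plain text'));
```

The short BMP and ICO signatures are checked last and given the lowest rating, which mirrors the point about false positives being more likely for those formats.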
Ideally, if everything is available to get-fav.php the following methods are used, in order:
1. FileInfo
2. `mime_content_type` (local files only)
3. `exif_imagetype` (and `image_type_to_mime_type` if available)
4. `getMIMETypeFromBinary` (the new fallback function using "magic")

202306071311:
- `--checklocal` / `--nochecklocal`, `--storeifnew` (requires --checklocal and --store) (Not implemented yet.)
- `--showconfig` / `--noshowconfig` to show running configuration options
- `--showconfigonly` (implies `--showconfig`), shows running configuration and exits.
- `--silent` (console mode only) (turns off the console completely)
- `ENABLE_SAME_FOLDER_INI` and `ENABLE_SAME_FOLDER_API_INI`. They default to `false`. If they are set to `true` and `get-fav.ini` and `get-fav-api.ini`, respectively, are in the same folder as `get-fav.php`, they will be read and used automatically. `--configfile` and `--apiconfigfile`, if specified, will be applied after.

It will likely be a few days before I do another `git push` as the next one is a big one:
- `storeifnew` is enabled it will be replaced)

202306062230:
- `file_get_contents` is not available, check if PHP.INI: `allow_url_fopen` is disabled, if so show an error message.

202306042323:
- `--apiconfigfile=PATHNAME` to load API Definitions

202306021445:

202306011529:
- `--allowoctetstream` / `--disallowoctetstream`, the default is `false` because if the more accurate content-type detection is not available most will return `application/octet-stream`. I may make the default true if `mime_content_type` and/or `finfo_open` are available. (.ini file is `[global] allow_octet_stream=boolean`)

202305312016:

202305281757:

202305251719:

202305242106:
- `get-fav-api.ini`)

202305241803:

202305241420:

202305221634:
- `parse_ini_file` with `INI_SCANNER_RAW`, does `array_replace_recursive` with the existing configuration structure and finally validates boolean/numeric (with range checks).)

202305231619:

Stuff being worked on:
(I'm keeping my github fork up to date as I work on stuff, assuming it's not throwing horrible errors.)
- `--checklocal` option will check the icon in the local path first and check online only if missing or otherwise invalid (size, type, blocklist). (in progress)
- `configuration.md` for detailed help on options.
- `defines` for easier maintenance.
- `--disableapis=google,faviconkit`)
- `microsoft.com.ico` becomes `microsoft.ico`)
- `--version` (aka `v` and `ver`)
- `define`
- `$debug` to a bool
- `--user-agent` is passed in.
- `writeOutput`)

Issues:
- `--help` output takes more than one standard console screen (`| more` or `| clip` need to be used)

Before pull request:

Other Tasks:

Notes:
- `define` block at the top,