digininja / CeWL

CeWL is a Custom Word List Generator

undefined method `scheme' for nil:NilClass in cewl 6.2 #121

ZeroChaos- closed 2 months ago

ZeroChaos- commented 3 months ago
Offsite link, not following: https://twitter.com/intent/tweet?text=Lovely article, well done!

Couldn't access the site (https://www.gateworld.net/)
Error: #<NoMethodError: undefined method `scheme' for nil:NilClass

                    if (parsed_url.scheme == "http" or parsed_url.scheme == "https") then
                                  ^^^^^^^>
Error: ["/usr/bin/cewl:172:in `block (2 levels) in start!'", "/usr/bin/cewl:171:in `select'", "/usr/bin/cewl:171:in `block in start!'", "/usr/bin/cewl:163:in `each'", "/usr/bin/cewl:163:in `start!'", "/usr/bin/cewl:115:in `start_at'", "/usr/bin/cewl:784:in `block in <main>'", "/usr/bin/cewl:774:in `catch'", "/usr/bin/cewl:774:in `<main>'"]
digininja commented 3 months ago

That's interesting. That isn't the latest code you are running; the latest looks like this on line 172:

if (parsed_url.scheme == "mailto" or parsed_url.scheme == "http" or parsed_url.scheme == "https") then

But 6.2 is the latest.

I could put in a fix to check if parsed_url is nil and fail gracefully, but I'd like to know how it ended up as nil. I've just scanned https://www.gateworld.net/ successfully. Is this a reproducible failure for you? If so, can you run it with debug and get the URL (or whatever else) it is failing on?
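For illustration, the nil check mentioned above could look something like the following. This is only a minimal sketch and not the code from the CeWL repository: the helper name `allowed_scheme?` and the constant `ALLOWED_SCHEMES` are made up for this example, and the list of schemes is taken from the line 172 quoted above.

```ruby
require "uri"

# Hypothetical names, not taken from the CeWL source.
ALLOWED_SCHEMES = %w[mailto http https].freeze

# Returns true only when parsed_url is non-nil and has an allowed scheme.
# The safe-navigation operator (&.) yields nil instead of raising
# NoMethodError when parsed_url itself is nil.
def allowed_scheme?(parsed_url)
  ALLOWED_SCHEMES.include?(parsed_url&.scheme)
end

puts allowed_scheme?(URI.parse("https://www.gateworld.net/"))
puts allowed_scheme?(nil)
```

A guard like this only hides the symptom; it does not explain why the parsed value was nil in the first place.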

ZeroChaos- commented 3 months ago

I suspect there was a transient network failure:

Couldn't access the site (https://www.gateworld.net/)

Would that cause an issue like this maybe?

digininja commented 3 months ago

The error came from a function that parses a URL into its parts, from which I then take the protocol. I can't think of a time when it would have something to parse that didn't have a protocol, as it expands relative URLs to full ones.

I'll put the hacky fix in, which will work, but I'd like to know what caused it so I can put a proper fix in.

digininja commented 2 months ago

Fixed it with this commit https://github.com/digininja/CeWL/commit/869f68f8b8d44e1e44d2eef1c4a91883d3ad2ff4

I'd still like to know what caused it, but this will do for now.