jankammerath / gophie

Gophie is a modern, graphical and cross-platform client or browser for "The Internet Gopher" also known as the Gopher protocol. Gophie supports browsing gopher pages, using search engines such as Veronica-2, displaying images and downloading files.
http://gophie.org
GNU General Public License v3.0

Use item types in Gopher URLs #14

Closed dotcomboom closed 4 years ago

dotcomboom commented 4 years ago


Gopher URLs around the internet tend to use a scheme in the form of gopher://{host}:{port}/{type}/{selector} (?{query} is appended in the case of search results, though Gophie does handle this already). Entering these URLs currently makes Gophie send a selector called /1/ instead of an empty selector (none) or /, bringing up a not found error on most hosts. Support for this form would be beneficial, as it would make Gophie more compatible with the URLs other clients use.

Also, at the moment (under Gophie 1.0), if I enter the URL gopher://gopher.somnolescent.net/aphrodisiac/aphrodisiac_-_01_this_nadir.wav, it will (as far as I can tell) try to load it as a menu and hang. This would be resolved without any guesswork on the client's end by supporting and using item types in the URL, like gopher://gopher.somnolescent.net/s/aphrodisiac/aphrodisiac_-_01_this_nadir.wav (where s is the type used for wave files). Gophie could then decide whether to show the download prompt based on whether the type is text-based or not.

(As for why URLs in the form of gopher://{host}:{port}/{selector} aren't typically used, it's a matter of reducing the guesswork the client has to do. Not having the type in the URL itself would mean that I could have a file at gopher://example.com/file and the client would not know whether it is a directory, text file, generic binary, image, etc. With HTTP this is a non-issue, as headers and MIME types tell the client what it will be receiving. Gopher is a headerless protocol, so it would be left up to the client to figure out what kind of file it is, which goes against the philosophy in RFC 1436 of the intelligence being held by the server.)
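To illustrate the point about the type removing guesswork, here is a minimal sketch of how a client might map an item type to a handling strategy. The class and enum names are hypothetical and not taken from Gophie. Types 0, 1, 7, g and I are from RFC 1436; anything else (including the commonly used but non-canonical s for sound) simply falls through to a download in this sketch.

```java
/** Hypothetical helper, not part of Gophie: picks a handling strategy from an item type. */
final class ItemTypeSketch {
    enum Handling { MENU, TEXT, SEARCH, IMAGE, DOWNLOAD }

    static Handling handlingFor(char type) {
        switch (type) {
            case '1': return Handling.MENU;    // directory / menu
            case '0': return Handling.TEXT;    // plain text file
            case '7': return Handling.SEARCH;  // index-search server
            case 'g':                          // GIF image
            case 'I': return Handling.IMAGE;   // other image
            default:
                // Assumption of this sketch: everything else (9, 5, s, ...) is
                // offered as a download. A real client would also want to deal
                // with types such as 3 (error) or 8 (telnet) separately.
                return Handling.DOWNLOAD;
        }
    }
}
```

With the type available up front, the client never has to inspect the response to decide between rendering a menu and prompting for a download.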

I might've overwritten/overstated a bit, but feel free to let me know if you have questions, since I'm really liking this client and it's looking to be one of the best current graphical ones I've used. Cheers!

jankammerath commented 4 years ago

Just to clarify the issue with you: Gophie treats every address entered into the address bar as a GopherMenu. This causes Gophie to misbehave when a URL entered directly into the address bar points to a file that is not of type GopherMenu.

The reason for that behaviour is that Gophie's "addressRequested"-handler in the MainWindow receives a new GopherItem with a GopherMenu by default and does not check for the possible file type associated with it.

Also, you cannot enter search requests or queries directly, as the address bar does not currently support inserting tabs other than through copy and paste.
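For context on why the tab matters: under RFC 1436 a search request is the selector, a tab, the search string, and a CRLF. A minimal sketch of building that request line follows. The class and method names are hypothetical, not Gophie's, and the UTF-8 encoding is an assumption of the sketch, since RFC 1436 does not specify one.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Gophie: builds the bytes a client sends to the server. */
final class RequestLineSketch {
    static byte[] requestLine(String selector, String search) {
        // Plain request: selector + CRLF. Search request: selector + TAB + search + CRLF.
        String line = (search == null) ? selector : selector + "\t" + search;
        // Assumption: UTF-8 on the wire; RFC 1436 does not name an encoding.
        return (line + "\r\n").getBytes(StandardCharsets.UTF_8);
    }
}
```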

The issue with the "/1/" in the URL is something I have seen before, but can you clarify a little on when that exactly happens?

dotcomboom commented 4 years ago

https://github.com/jankammerath/gophie/pull/15 I have made a pull request with some of my fixes.

[image]

This file's /0/ part, for example, is the item type. You could check what the file type is with the parsed URL's typeCode. Right now I have it working for menus and text files in the edits I've made to the GopherPage class, though this should be checked before making a request in the case of binaries.

mariteaux commented 4 years ago

Hi, I can replicate the bug. To clarify what the issue is (IMO), Gophie can handle a binary file in a menu because it gets the item type from the menu itself and can handle it accordingly. Gophie does not read an item type from the URL itself, though, which not only makes URLs with item types incompatible (it treats the item type as part of the path), but also means it tries to parse every arbitrary URL as a menu. Almost all clients I've seen use item types in their URLs; my preferred one, Netscape 4.5, is a good example. Pasting a URL from it into Gophie returns an error.

As evidence of this, I'm thinking the hang is due to the size of the file overloading something in Gophie's menu parser. Going to gopher://gopher.somnolescent.net/aphrodisiac/aphrodisiac_-_01_this_nadir.wav hangs the program for me, but going to, say, gopher.somnolescent.net/pennyverse/concept/colton.gif, which is a much smaller file, doesn't hang it, but it does spit out a glitched menu.

I definitely think the easiest way around this would be to allow the item type to be specified in the URL (the /1/ bit) and parse that instead of checking anything in the file directly. There are far too many file formats for Gophie to be able to check them all, even just counting common binary formats for images, sound, documents, archives, and programs, but parsing the type from the URL leaves it up to the user to make sure the specific file is handled correctly.

jankammerath commented 4 years ago

@mariteaux that issue for the binary files should be fixed now with commit b1f2a66.


Same for the GIF file mentioned above. Gophie will check for popular file headers after it has received the first few hundred bytes and determine what to do with the content. For binary files it will prompt for a download, and for image files it will just show them.
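As a rough illustration of this kind of header check (not the code from the commit), a sketch might compare the first bytes against a few well-known magic numbers. The class name, the enum, and the particular set of formats are assumptions of this example; the signatures themselves are the standard ones for GIF, PNG, JPEG and RIFF/WAVE.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not Gophie's implementation: guesses a format from leading bytes. */
final class HeaderSniffSketch {
    enum Kind { GIF, PNG, JPEG, WAV, UNKNOWN }

    private static final byte[] PNG_MAGIC =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG_MAGIC = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    static Kind sniff(byte[] head) {
        if (matchesAt(head, 0, ascii("GIF87a")) || matchesAt(head, 0, ascii("GIF89a"))) {
            return Kind.GIF;
        }
        if (matchesAt(head, 0, PNG_MAGIC)) return Kind.PNG;
        if (matchesAt(head, 0, JPEG_MAGIC)) return Kind.JPEG;
        // WAV files are RIFF containers: "RIFF", 4 size bytes, then "WAVE".
        if (matchesAt(head, 0, ascii("RIFF")) && matchesAt(head, 8, ascii("WAVE"))) {
            return Kind.WAV;
        }
        return Kind.UNKNOWN; // caller falls back to menu/text handling or a download prompt
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** True if data contains magic starting at offset; false if data is too short. */
    private static boolean matchesAt(byte[] data, int offset, byte[] magic) {
        if (data.length < offset + magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if (data[offset + i] != magic[i]) return false;
        }
        return true;
    }
}
```

Sniffing like this only covers formats the client knows about, which is the trade-off discussed earlier in the thread.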

jankammerath commented 4 years ago

@dotcomboom @mariteaux I am a bit hesitant about implementing the item type in the URL, as I cannot find anything about it in RFC 1436. Have you found it somewhere in there?

It would also have the consequence that any possible gopher server using "/(a-z0-9)/" like "/s/sabrina.txt" in their selector would have trouble with Gophie as it would try to handle it as something it might not be.

I do get your point as I have now tested it with Netscape 4.8 myself. I'll have a look into the Netscape source code on the weekend to try and figure out how they implemented it.

mariteaux commented 4 years ago

RFC 1436 says nothing about URLs, but every other client I've used, including Netscape, Lynx, Camino, and Overbite, encodes the item type in the URL in this manner. So while it isn't part of the standard, leaving it out does confuse matters between clients. We first encountered this issue when I pasted a link from Netscape into IRC and dcb pasted it into Gophie.

For what it's worth, I've never seen a Gopher server in the wild that uses directories in that manner, and even if they did, the url would be /0/s/sabrina.txt. Other clients would be able to handle this fine.

jankammerath commented 4 years ago

I think you're right. I'll make it configurable and turned on by default.

dotcomboom commented 4 years ago

I would like to correct myself on the URL scheme. The Gopher URLs I described actually fall under this RFC, where the / portion would be part of the selector itself.

   A Gopher URL takes the form:

      gopher://<host>:<port>/<gopher-path>

   where <gopher-path> is one of:

      <gophertype><selector>
      <gophertype><selector>%09<search>
      <gophertype><selector>%09<search>%09<gopher+_string>

(Although, ? instead of %09 for an encoded tab is common practice these days.)

Parsing the type would be a matter of taking out the first character of the <gopher-path>. The rest would be the selector. Most modern Gopher servers have / at the start of their selectors, and that's what got me confused about the form of it all.
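The parsing described here can be sketched as follows. This is a minimal illustration, not the code from the pull request: the class and field names are hypothetical, Gopher+ strings are not handled, and the defaults (port 70, type 1 for an empty path) are assumptions of the sketch. It accepts both the %09 form and the common ? form for searches.

```java
import java.net.URI;

/** Hypothetical helper, not part of Gophie: splits a gopher URL into type and selector. */
final class GopherUrlSketch {
    final String host;
    final int port;
    final char type;
    final String selector;
    final String search; // null when the URL carries no search string

    private GopherUrlSketch(String host, int port, char type, String selector, String search) {
        this.host = host;
        this.port = port;
        this.type = type;
        this.selector = selector;
        this.search = search;
    }

    static GopherUrlSketch parse(String url) {
        URI uri = URI.create(url); // throws IllegalArgumentException on malformed input
        if (!"gopher".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not a gopher URL: " + url);
        }
        int port = uri.getPort() == -1 ? 70 : uri.getPort(); // assumed default port
        String path = uri.getPath() == null ? "" : uri.getPath(); // %09 is decoded to a tab
        // The first "/" only separates host:port from the gopher-path.
        String gopherPath = path.startsWith("/") ? path.substring(1) : path;
        String search = uri.getQuery(); // the "?" form of a search
        if (gopherPath.isEmpty()) {
            return new GopherUrlSketch(uri.getHost(), port, '1', "", search);
        }
        char type = gopherPath.charAt(0);        // first character is the item type
        String selector = gopherPath.substring(1); // the rest is the selector
        int tab = selector.indexOf('\t');
        if (tab >= 0) {                           // the "%09" form of a search
            search = selector.substring(tab + 1);
            selector = selector.substring(0, tab);
        }
        return new GopherUrlSketch(uri.getHost(), port, type, selector, search);
    }
}
```

Under this reading, gopher://example.com/0/s/sabrina.txt yields type 0 and selector /s/sabrina.txt, so a server that really does keep files under a one-letter directory is still addressable.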