wxWidgets / wxWidgets

Cross-Platform C++ GUI Library
https://www.wxwidgets.org/
5.78k stars 1.7k forks source link

wxURI #6789

Closed wxtrac closed 2 years ago

wxtrac commented 19 years ago

Issue migrated from trac ticket # 6789

priority: normal

2004-08-08 10:49:54: ryannpcs created the issue


Generic URI class.

Includes some documentation, and the usual modifications to files.bkl.

Note that you need to add uri.tex to classes.tex.

wxtrac commented 19 years ago

2004-08-08 17:03:08: @vslavik commented


Thanks! Some comments based on a brief (i.e., not in-depth) look at the patch:

wxtrac commented 19 years ago

2004-08-08 20:15:21: ryannpcs commented


Thanks for looking at it Vaclav.

1) I made some of the utility functions public (Encode, etc.) because I thought they might be useful to applications that use wxHTTP, for instance. If you still think they should be moved then I'll do so.

2) I don't use wxStrings for several reasons -

2a) A component may be undefined, not just empty (Could use a bit field, but then that makes the implemented more obfuscated)

2b) The parsing functions are much more efficient with char* (I use memcpy all over the place - using GetWriteBuf() that many times seems inefficient)

Mostly because of 2a - instead of checking for NULL clients would have to do something like

if(uri.IsDefined(wxURI::SCHEME)) wxPrintf(uri.GetScheme());

Which seems less intuitive

if(uri.GetScheme()) wxPrintf(uri.GetScheme());


Anyway let me know what you think of the above 2 things.
In the mean time I'll throw in the other things you mentioned.

Thanks, Ryan

wxtrac commented 19 years ago

2004-08-08 22:12:08: ryannpcs commented


Updated documentation attached - let me know if this is more along the lines of what you want.

BTW concerning the string problem what would be nice would be to create a string, passing the size to create it in its ctor, and then accessing the string data directly.

wxtrac commented 19 years ago

2004-08-09 00:56:22: ryannpcs commented


OK - Here's the patch that takes into account what you noted.

Also, I converted the class members to wxStrings liked you asked - but I don't know if that was a good idea or not.

Thanks! Ryan

wxtrac commented 19 years ago

2004-08-09 23:01:00: @vslavik commented


1) I made some of the utility functions public (Encode, etc.) because I thought they might be useful to applications that use wxHTTP, for instance. If you still think they should be moved then I'll do so.

I think, because they seem to be of very limited use to me and once a function becomes part of public API, we have to keep it for backward compatibility.

2a) A component may be undefined, not just empty (Could use a bit field, but then that makes the implemented more obfuscated)

When can this happen?

2b) The parsing functions are much more efficient with char* (I use memcpy all over the place - using GetWriteBuf() that many times seems inefficient)

I don't understand how is allocating the memory yourself any more efficient than using GetWriteBuf (or better, wxCharBuffer)...

if(uri.IsDefined(wxURI::SCHEME)) wxPrintf(uri.GetScheme());

Which seems less intuitive

It seems more intuitive to me, it explicitly says what you're interested in. It would be nice to have the API consistent, BTW: either IsDefined(wxURI::SCHEME) and GetPart(wxURI::SCHEME) or (my preference) HasScheme() and GetScheme().

But I think that using wxString is a no-brainer, because a) we use it everywhere in the wx API and b) it is much easier to manage, you don't have to worry about memory deallocation (or exceptions).

Also, I converted the class members to wxStrings liked you asked - but I don't know if that was a good idea or not.

There are still lots of places in the wxURI API that use char_type*, not wxString...

Updated documentation attached - let me know if this is more along the lines of what you want.

It would be nice to have more detailed introduction in plain English explaining what the class is for and where's it used (its relation to wxURL is no doubt going to be a great source of confusion).

Some nitpicking: I think wxURI::Create is not good place for URI examples, I would prefer a short introduction at the top of class docs with a pointer(s) to relevant RFC(s); IsDefined's argument values are not described (but see above about IsDefined), formatting is wrong in quite a few places (using 1)..., 2)... bullets without using appropriate LaTeX commands creates a single-paragraph mess), URLs should probably be inside {\tt }, you should be using \arg, \true, \false, \NULL etc. macros, "URI" and not "uri", that kind of things.. Generally speaking ,it shouldn't look out of place if you compare generated HTML to, say, wxString documentation.

Some more things:

wxtrac commented 19 years ago

2004-08-09 23:45:52: ryannpcs commented


Hi Vaclav,

1) I made some of the utility functions public (Encode, etc.) because I thought they might be useful to applications that use wxHTTP, for instance. If you still think they should be moved then I'll do so.

I think, because they seem to be of very limited use to me and once a function becomes part of public API, we have to keep it for backward compatibility. Well, Encode/Decode are part of the uri standard - but I agree they should be changed to use wxStrings. Normalize can probably go though.

2a) A component may be undefined, not just empty (Could use a bit field, but then that makes the implemented more obfuscated)

When can this happen? Here's a simple example:

../mysite

in the above URI the authority (server, user etc.) is undefined, whereas in the following URI:

http://mysite

the authority is empty (""), but is defined

2b) The parsing functions are much more efficient with char* (I use memcpy all over the place - using GetWriteBuf() that many times seems inefficient)

I don't understand how is allocating the memory yourself any more efficient than using GetWriteBuf (or better, wxCharBuffer)... In retrospect I think you're probably right... I'll redo it with wxCharBuffer

if(uri.IsDefined(wxURI::SCHEME)) wxPrintf(uri.GetScheme());

Which seems less intuitive

It seems more intuitive to me, it explicitly says what you're interested in. It would be nice to have the API consistent, BTW: either IsDefined(wxURI::SCHEME) and GetPart(wxURI::SCHEME) or (my preference) HasScheme() and GetScheme(). Good idea - I think the latter is better too.

Updated documentation attached - let me know if this is more along the lines of what you want.

It would be nice to have more detailed introduction in plain English explaining what the class is for and where's it used (its relation to wxURL is no doubt going to be a great source of confusion). OK.

Some nitpicking: I think wxURI::Create is not good place for URI examples, I would prefer a short introduction at the top of class docs with a pointer(s) to relevant RFC(s); OK - most other classes have a short intro, so that's why I put it there...

IsDefined's argument values are not described (but see above about IsDefined), formatting is wrong in quite a few places (using 1)..., 2)... bullets without using appropriate LaTeX commands creates a single-paragraph mess), URLs should probably be inside {\tt }, you should be using \arg, \true, \false, \NULL etc. macros, "URI" and not "uri", that kind of things.. Generally speaking ,it shouldn't look out of place if you compare generated HTML to, say, wxString documentation. Thanks for the reference. BTW, what does \arg do?

  • why do you use "regname" for "undefined component"? I had to read through the sources to understand what is "regname" supposed to mean (and I'm still not 100% sure I got it right) The server component can be either an ip address (of varying versions) or a regname.

For example an app might do the following:

if(REGNAME) myaddr.ConnectByHostname(uri.GetServer()); else myaddr.ConnectByIP(uri.GetServer());

  • we need (as extensive as possible, as always) unit test for this class, to ensure it works and will work correctly... Right.

  • I assume the decision to leave password as part of "user" component and not parse it was deliberate in line with RFC's recommendation? Yep - no password in URIs :). In fact the w3c devotes a couple of paragraphs to reasons why it should not be used :).

  • Shouldn't wxURL use this class [in the future]? One idea might be to make the parse functions protected and then have wxURL derive from wxURI and parse according to the older URL RFC (It parses the username differently to account for the password, for example). However, it sounds like the w3c is trying to phase out URLs in preference to URIs (indeed they will soon put in specifications for http/etc. URIs), so mayby its just better to phase out wxURL altogether.


BTW, as a request mayby we could upload this to the tree without changing the bakefiles, and move the discussion to wx-dev? Otherwise it takes me 1.5-3 hours just to make/edit a diff to submit.

Thanks! Ryan

wxtrac commented 19 years ago

2004-08-09 23:49:19: ryannpcs commented


Whoops - http://mysite should be http:///mysite :)

wxtrac commented 19 years ago

2004-08-10 06:08:41: ryannpcs commented


OK, I've attached a new version with everything you suggested.

3 questions though - 1) Do we need an introduction to URIs in the docs? I wrote a partial one then ditched it because RFC 2396 does it much better job of explaining them

2) What's missing from the new version of the docs?

3) Do we want wxURI to maintain the Full URI within wxURI so each call to Create doesn't ential rebuilding the URI, and force a wxString to be passed to wxURI::Create?

Anyway - that's it. As you suggested there is also a unit test in console.cpp.

Thanks! Ryan

wxtrac commented 19 years ago

2004-08-14 22:19:07: @vslavik commented


Sorry for the delay with replying and thanks for updated patch!

Well, Encode/Decode are part of the uri standard - but I agree they should be changed to use wxStrings. Normalize can probably go though.

I have no idea what Encode and Decode do from your comments in the header, so I can't tell, but Normalize sound's like something that should be either included or always done automatically. What I had in mind were clearly private functions like Parse or all of ParseXXX, IsEscape etc.

Thanks for the reference. BTW, what does \arg do?

When you're writing something about method's argument foo, write it as \arg{foo}, it will format "foo" in italics (by default) to make it clear it's an argument.

However, it sounds like the w3c is trying to phase out URLs in preference to URIs (indeed they will soon put in specifications for http/etc. URIs), so mayby its just better to phase out wxURL altogether.

IMHO if the class is going to be useful for any serious work, it has to support URLs according to older RFCs -- I think the goal should be to make it easy to parse URIs/URLs that apps may encounter, rather than to implement RFC 2396.

BTW, as a request mayby we could upload this to the tree without changing the bakefiles, and move the discussion to wx-dev? Otherwise it takes me 1.5-3 hours just to make/edit a diff to submit.

I'm opposed to uploading unfinished patch to CVS. I don't understand the part about how long it takes you to make a diff: running diff on two directories is a matter of minutes at most. There are only five files in your patch: even copying them over to another directory structure and typing the diff command can't take more than five minutes...

1) Do we need an introduction to URIs in the docs? I wrote a partial one then ditched it because RFC 2396 does it much better job of explaining them

You're right -- I think a link to RFC would be fine.

2) What's missing from the new version of the docs?

Links should be links, there are still formatting errors (such as _is_ instead of {\em is}), "uri" instead of "URI" (mixing both is even worse than using "uri" everywhere, not using {\tt ...} for literals or small inlined examples). There are undefined references (wxuriisdefined -- is it IsDefined or GetIsDefined) -- please generate HTML version of the docs and proof-read it! It's unclear to me what's the difference between GetPort and GetPortValue -- as far as I can tell, the former is useless (port is always integer, isn't it?) and the latter should be called GetPort. If both are needed, add "See also" links (this should be done for all methods that are related, e.g. HasXXX and GetXXX). GetXXX is missing explanation of which part of the URI given component is -- I for sure don't immediately know what "query" is, I shouldn't be required to read the RFC to be able to use the class, it should be explained either in methods' or class' documentation.

The documentation IMHO doesn't need to mention that you can use wxURI to validate URIs, that's obvious -- OTOH, it should provide IsOk() method in addition to Create's return value. Create should definitely only accept valid URIs as input and not strings that begin with URI (and thus, it should return bool) -- if you need a function to find URI in string, add it as another method. There's no point in documenting destructors -- everybody who uses C++ knows what dtors do.

What do you mean by "it's recommended to call Create only once"? Either calling it multiple times does work, in which case there should be no such comment, or it doesn't, in which case it should assert (or not exist at all and have only ctors).

Please follow wx's coding style: members should be m_fooBar, not mFooBar, function arguments shouldn't have m prefix, nontrivial functions should not be inline.

The public API still uses const char* in at least one place (ctor), please fix it. You've made a lot of useful things protected, too, such as operator=(=) -- by accident, I assume?

3) Do we want wxURI to maintain the Full URI within wxURI so each call to Create doesn't ential rebuilding the URI, and force a wxString to be passed to wxURI::Create?

Huh?

Anyway - that's it. As you suggested there is also a unit test in console.cpp.

Please have a look at tests in the tests/ for what I meant by "unit test". In short, it's test of class' methods written using CppUnit framework. Preferably, unit tests would cover all methods and both typical and corner cases -- isn't there some RFC 2396 test suite available?

wxtrac commented 19 years ago

2004-08-15 00:31:09: ryannpcs commented


Hi Vaclav,

Sorry for the delay with replying and thanks for updated patch! Thanks for the helpful replies!

IMHO if the class is going to be useful for any serious work, it has to support URLs according to older RFCs -- I think the goal should be to make it easy to parse URIs/URLs that apps may encounter, rather than to implement RFC 2396. Err... then when should I parse it as a URI? The obvious way is to parse it according to it's scheme/protocol - however the older URL RFC defines a spec for generic URLs also... i.e. the question is should I just parse it according to the protocol, then if a protocol isn't found in wxProtoInfo, just parse it as a generic URI?

Not only this, but wxURL doesn't appear to do any validating at all - it just assumes the URL is valid to begin with....

In addition, I could put in some API compatability with wxURL if desired (I'm assuming SetProxy does roughly the same thing as Resolve()).

I have no idea what Encode and Decode do from your comments in the header, so I can't tell, but Normalize sound's like something that should be either included or always done automatically. What I had in mind were clearly private functions like Parse or all of ParseXXX, IsEscape etc. Right - the updated patch puts all those protected and normalizes the path in Create.

When you're writing something about method's argument foo, write it as \arg{foo}, it will format "foo" in italics (by default) to make it clear it's an argument. Thanks - is this preferred over \param?

You're right -- I think a link to RFC would be fine. OK.

Links should be links, there are still formatting errors (such as _is_ instead of {\em is}), "uri" instead of "URI" (mixing both is even worse than using "uri" everywhere, not using {\tt ...} for literals or small inlined examples). There are undefined references (wxuriisdefined -- is it IsDefined or GetIsDefined) -- OK - I found \urlref also... for the inlined examples I use \begin{verbatim} etc..

please generate HTML version of the docs and proof-read it! I did :).

It's unclear to me what's the difference between GetPort and GetPortValue -- as far as I can tell, the former is useless (port is always integer, isn't it?) and the latter should be called GetPort. If both are needed, add "See also" links (this should be done for all methods that are related, e.g. HasXXX and GetXXX). GetXXX is missing explanation of which part of the URI given component is -- I for sure don't immediately know what "query" is, I shouldn't be required to read the RFC to be able to use the class, it should be explained either in methods' or class' documentation. Makes sense. BTW the port is stored internally as a string - GetPortValue calls wxAtoi on it...

The documentation IMHO doesn't need to mention that you can use wxURI to validate URIs, that's obvious -- OTOH, it should provide IsOk() method in addition to Create's return value. Create should definitely only accept valid URIs as input and not strings that begin with URI (and thus, it should return bool) -- if you need a function to find URI in string, add it as another method. There's no point in documenting destructors -- everybody who uses C++ knows what dtors do. OK.

What do you mean by "it's recommended to call Create only once"? Either calling it multiple times does work, in which case there should be no such comment, or it doesn't, in which case it should assert (or not exist at all and have only ctors). OK - I'll clarify that.

Please follow wx's coding style: members should be m_fooBar, not mFooBar, function arguments shouldn't have m prefix, nontrivial functions should not be inline. I also make functions inline if they are only used internally in one place... but I'll un-inline them for now...

The public API still uses const char* in at least one place (ctor), please fix it. You've made a lot of useful things protected, too, such as operator=(=) -- by accident, I assume? Must be, sorry :).

Huh? Well, wxURL has a m_url member. Was wondering if there was a need for a corresponding m_uri member, etc. or if wxURI rebuilding the uri from the individual components from Get() was ok.

Please have a look at tests in the tests/ for what I meant by "unit test". In short, it's test of class' methods written using CppUnit framework. Preferably, unit tests would cover all methods and both typical and corner cases -- isn't there some RFC 2396 test suite available?

I see what you're talking about. However, I didn't see any explicit tests in RFC 2396, just some examples - all of which I included in the console example. Googling/searching didn't turn up much either... However, I will make a unit test according to CppUnit if needed based off the tests I put in the console example (and possibly some more I can think up).

wxtrac commented 19 years ago

2004-08-15 08:41:34: ryannpcs commented


Updated patch attached.

I added unit tests, made IsOk(), and several other things.

Things I didn't do: 1) change /m./ to /m_./ in the class and parsexxx methods (as noted on the ml, this is based off of one of my projects which follows a different coding style... so there's bound to be a couple of these...) 2) GetPort/GetPortValue ambiguity... (I'm still not sure what you want. At least from the docs, it's clear that GetPort() returns a string and GetPortValue returns a size_t) 3) const char* (I didn't find any.... char_type is wxChar... mayby you wanted a const wxString& passed instead?)

Wishlist/Noncritical Things: 1) A couple punctuation errors in the docs 2) Document query etc. in GetXXX function docs 3) Have wxURL inherit from wxURI and validate according to older RFC....

wxtrac commented 19 years ago

2004-08-15 13:05:31: ryannpcs commented


I had a little more time to iron out some edges...

1) Corrected member/function variable names to be more in line with wx specs 2) Use wxStrings instead of wxChars 3) Got rid of GetPartValue (seems to cause more confusion that it helps) 4) Quite a bit of changes to docs, a little info in get methods on URI components 5) More unit tests (and more I can't recall)

The only thing that appears to need to be done is the issues with wxURL (not exactly critical), and mayby Extract (I can't seem to come up with a better function name :)).

Ryan

wxtrac commented 19 years ago

2004-10-09 14:26:24: @vslavik commented


2) GetPort/GetPortValue ambiguity... (I'm still not sure what you want. At least from the docs, it's clear that GetPort() returns a string and GetPortValue returns a size_t)

But the meaning of the returned value is not clear. Either it's the same thing, in which there should be only one function, or they return something different, in which case the docs has to explain what the values mean.

Once again, the utility functions shouldn't be publicly visible (they're protected, which means they are visible in derived classes). The static ones shouldn't be in the header at all, I think, they should be hidden in .cpp file.

I think it's very confusing to have wxURI::char_type, size_type and buffer_type. Just use the standard types like wxChar and size_t.

and mayby Extract (I can't seem to come up with a better function name :)).

I still don't like its present form -- it initializes the wxURI object and returns some value, it should do only the latter (and be static). This is too confusing.

3) const char* (I didn't find any.... char_type is wxChar... mayby you wanted a const wxString& passed instead?)

Yes, everywhere in public (i.e., non-private) API.

When you're writing something about method's argument foo, write it as \arg{foo}, it will format "foo" in italics (by default) to make it clear it's an argument. Thanks - is this preferred over \param?

\param is something else altogether, it's used to write the list of parameters with their descriptions; \arg{} is used when you refer to a parameter in text.

One more thing: it would be more in line with rest of wx to not use wxURI::SCHEME etc., but use either wxURI_SCHEME or wxURI::Part_Scheme (and same with HostType: wxURI::Host_RegName); IIRC we don't use capitalized constants that are within class' namespace...

Thanks! Vaclav

wxtrac commented 19 years ago

2004-10-10 22:08:56: ryannpcs commented


2) GetPort/GetPortValue ambiguity... (I'm still not sure what you want. At least from the docs, it's clear that GetPort() returns a string and GetPortValue returns a size_t)

But the meaning of the returned value is not clear. Either it's the same thing, in which there should be only one function, or they return something different, in which case the docs has to explain what the values mean.

OK - in this version of the patch there already is only one version (the string version).

Once again, the utility functions shouldn't be publicly visible (they're protected, which means they are visible in derived classes). The static ones shouldn't be in the header at all, I think, they should be hidden in .cpp file.

Well, I did that on purpose because they could be used in, say, wxURL or wxURN if we decided to do something like that.

Anyway, if you still disagree I can move the function to the implementation file.

I think it's very confusing to have wxURI::char_type, size_type and buffer_type. Just use the standard types like wxChar and size_t.

OK.

and mayby Extract (I can't seem to come up with a better function name :)).

I still don't like its present form -- it initializes the wxURI object and returns some value, it should do only the latter (and be static). This is too confusing.

Good idea.

3) const char* (I didn't find any.... char_type is wxChar... mayby you wanted a const wxString& passed instead?)

Yes, everywhere in public (i.e., non-private) API.

OK.

IIRC we don't use capitalized constants that are within class' namespace...

We do for wxThread.

I'll move the public enum outside the class and prefix it with wxURI_. The other enum (with SCHEME, etc.) is only used within wxURI itself...

Thanks! Ryan

wxtrac commented 19 years ago

2004-10-10 23:33:42: ryannpcs commented


OK - updated version attached!

wxtrac commented 19 years ago

2004-10-21 04:47:50: ryannpcs commented


Removed since it is incorrect. Will come back to it at a later date.