Closed spyoungtech closed 7 years ago
@spyoungtech, thanks for the sample case. Definitely an issue handling the greater-than/less-than characters. Both REST and XML-RPC will fail.
I tried making a quick fix escaping some characters, however contentbody/convert/storage
(and convertWikiToStorageFormat
) appear to unescape the data again before a storage request is made. So, there may be a need to two-passes to escape some characters (from just a quick look at this). When this issue can be looked at, it would be good to check for additional characters which could cause this issue as well.
thanks for filing the issue @spyoungtech I'll hopefully get around to reviewing this next week
This is a tricky one, I'm reading the spec and <
, >
are reserved characters so it's not clear whether your document is even valid RST
http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#inline-markup-recognition-rules
I'm reluctant to just put a blanket replace
method in for those characters since they can be used for other signs and it'll just create new bugs.
Another option could be to use a replace method only within the Sphinx builder for paragraphs. I don't know if that helps you because it's a list item
Thanks for the response @tonybaloney -- I hadn't considered that it may technically be invalid rst though I'm not entirely clear from reading the spec, myself. Sphinx does, however, process this material fine, for example, when building to html, like for readthedocs.
When built to html, the 'offending' <
and >
appear to change to <
and >
respectively.
<a class="reference external" href="http://pythonhosted.org">http://pythonhosted.org</a>/<project-name></p>
Based on some related issues, it looks like there are other troublesome characters that confluence does not like.
I agree that a blanket replace
is not the way to go. When I clear off some other things from my plate at work, I'll be able to use my paid time to read up on the source code, investigate, and hopefully contribute some code or ideas.
For your reference, I am trying to publish the material contained in the Brandon Rhodes sphinx-tutorial -- this content comes out of the 'handout' directory.
I've written a fix that replaces certain characters inside a paragraph. I'll push a PR and if you would be able to test it or share
this fix does not fix the reported issue :-(
I'm still not sure what Sphinx is supposed to do because you can't have triangular brackets in a URL. Looking at the Github renderer, it shows a URL and then a piece of text, so it separates the 2. @brandon-rhodes tutorial looks great and we should support it but I'm not sure what the behaviour should be
this is what github renders the RST as
<p><a href="http://pythonhosted.org">http://pythonhosted.org</a>/<project-name></p>
Thanks for pointing out the problem! Since those are not valid URLs, they should not be bare text in the RST document. I have just committed a fix that turns them into code literals, since they are samples and cannot literally be clicked on and followed. Let me know if this resolves the problem!
What Brandon says makes perfect sense. The URLs in this example should be literals, rather than bare text.
@tonybaloney this is the behavior I'm seeing now after pulling down the changes in #43
I see the text replacement in the produced .conf
file (this example is after pulling Brandon's changes to the tutorial repo)
{{http://pythonhosted.org/<project-name>}}
But then after it goes through the confluence storage converter, it appears that, as @jdknight mentioned earlier, confluence unescapes this in the storage format to
<code>https://readthedocs.org/projects/<project-name>/</code>
This is the same result prior to #43 (IE happens in both cases)
Therefore, it still throws the bad api error because of the unexpected "tag". What's interesting to me is that the API is returning invalid storage format content. To me, this seems more of a bug with Atlassian's API. Wondering if there is an open issue on this in Atlassian's bug tracker.
This leads me to believe that a sanitization step is needed after the storage format conversion, unless it's possible to produce some format that won't produce this problem when run through the conversion API...
I'm going to look at this some more this week.
Also worth mentioning that, if I take the product of the .conf
file (as produced before #43) and manually insert this markup into confluence, it works as expected. However, I noticed that it uses a tinymce rest endpoint to do the conversion at https://<BASE_URL>/rest/tinymce/1/wikixhtmlconverter
This does the right thing, whereas the Confluence REST API does not. I'm running Confluence Server 6.2.2
This issue should now be resolved in db5b916e57898eb86244b4c33a801f5da5c3b984 (available on the master; see specifically 255043422d25f8e437baf3182eafa6464a1ab288). To take advantage of this change, one can pull from master
. This change should be made available in an upcoming v0.7 release (#29).
I'll be marking this as closed as I believe the root issue has been resolved. If there's another issue I have missed, please feel free to bring it up in another issue.
I ran into an issue attempting to publish an rst document which, in relevant part, contains a list that has url's that contain placeholder text (IE
<placeholder>
) which resulted in aConfluenceBadApiError
exception.Checking the response returned by Confluence:
Sample of the RST content: