ndmitchell / hoogle

Haskell API search engine
http://hoogle.haskell.org/

JSON response contains HTML markup #260

Closed dispanser closed 6 years ago

dispanser commented 6 years ago

Using Hoogle 5, the JSON response contains HTML markup and escape sequences (e.g. &gt;).

E.g., when querying https://hoogle.haskell.org/?mode=json&hoogle=fmap&count=1, the formatted JSON response is:

[
  {
    "url": "https://hackage.haskell.org/package/base/docs/Prelude.html#v:fmap",
    "module": {
      "url": "https://hackage.haskell.org/package/base/docs/Prelude.html",
      "name": "Prelude"
    },
    "package": {
      "url": "https://hackage.haskell.org/package/base",
      "name": "base"
    },
    "item": "<span class=name><0>fmap</0></span> :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b",
    "type": "",
    "docs": ""
  }
]

I think that the default behavior shouldn't contain any markup, as it's supposed to be consumed by a machine.

ndmitchell commented 6 years ago

The HTML should be quite readable to a machine and not too hard to strip out. The information about coloring etc. seems valuable to many consumers of the API, which is why I included it. Suggestions?
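
For readers who only want the plain signature, stripping could look roughly like the following. This is a minimal sketch in Python, not anything Hoogle ships: the function name `strip_markup` and the regular expression are assumptions made for illustration, and the input is the `item` string from the sample response above. Tags are removed before entities are decoded, so an escaped `&lt;` or `&gt;` that belongs to the signature is never mistaken for a tag.

```python
import html
import re

# Matches tag-like spans such as <span class=name>, <0>, </0> and </span>.
# Assumption: literal < and > inside signatures are always escaped as entities.
TAG_RE = re.compile(r"</?[A-Za-z0-9][^<>]*>")


def strip_markup(item: str) -> str:
    """Drop the tags first, then decode entities such as &gt; into plain characters."""
    without_tags = TAG_RE.sub("", item)
    return html.unescape(without_tags)


# The "item" field from the sample response in this thread.
item = (
    "<span class=name><0>fmap</0></span> :: "
    "Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b"
)

print(strip_markup(item))  # fmap :: Functor f => (a -> b) -> f a -> f b
```

A real HTML parser would be more robust than a regular expression if the markup ever becomes more complex than what the sample shows.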

dispanser commented 6 years ago

Thanks for your response. From my perspective, adding html markup is easier than removing it, but if you prefer the current approach (I assume it's used somewhere already in its current form), I'll just keep stripping the markup off.

For reference, I'm coming from https://github.com/gibiansky/IHaskell/pull/825, which attempts to adapt the IHaskell kernel to the new Hoogle.

Closing

ndmitchell commented 6 years ago

The HTML markup actually makes it into the Hoogle databases at the end, so it gets added (and has to be added) offline, before the results are generated. Removing it would require some post-processing, which is no harder or easier on your side. I confirm it is used several times in its current form. Note that the markup also says which arguments map to which, which might be valuable in IHaskell at a future date.
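
As a rough illustration of how a consumer might read the numbered tags instead of discarding them, here is a minimal Python sketch. The only numbered tag visible in this thread is `<0>…</0>` around the name, so the function name `numbered_spans`, the assumption that such tags always have the form `<N>text</N>`, and the assumption that they are not nested are all hypothetical.

```python
import html
import re
from collections import defaultdict
from typing import Dict, List

# Numbered spans of the form <N>text</N>, e.g. <0>fmap</0> in the sample response.
# The backreference \1 requires the closing tag to carry the same number.
NUMBERED_RE = re.compile(r"<(\d+)>(.*?)</\1>")


def numbered_spans(item: str) -> Dict[int, List[str]]:
    """Group the text of each numbered span by its number, decoding entities."""
    spans = defaultdict(list)
    for number, text in NUMBERED_RE.findall(item):
        spans[int(number)].append(html.unescape(text))
    return dict(spans)


# The "item" field from the sample response in this thread.
item = (
    "<span class=name><0>fmap</0></span> :: "
    "Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b"
)

print(numbered_spans(item))  # {0: ['fmap']}
```

What each number means is defined by Hoogle, not by this sketch; the code only groups the tagged text by its number.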