Handle trailing dots in references

dsheets commented 9 years ago

Either fail as early as possible, move them outside the reference, or ignore them.

Right now, nothing happens and they bubble up through the XML, which then fails mysteriously with a syntax error.
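For illustration, the "move them outside the reference" option could look roughly like the sketch below. `split_trailing_dots` is a hypothetical helper, not something that exists in doc-ock; it only shows how a raw reference string might be separated from the dots that follow it, so that the dots can be emitted as ordinary text.

```ocaml
(* Hypothetical helper, not part of doc-ock: split trailing dots off a raw
   reference string so they can be treated as ordinary text. *)
let split_trailing_dots (s : string) : string * string =
  let n = String.length s in
  (* Walk back from the end while the preceding character is a dot. *)
  let rec last_non_dot i =
    if i > 0 && s.[i - 1] = '.' then last_non_dot (i - 1) else i
  in
  let k = last_non_dot n in
  (String.sub s 0 k, String.sub s k (n - k))
```

With this sketch, `split_trailing_dots "Foo.bar."` yields the pair `("Foo.bar", ".")`, and a string with no trailing dots comes back unchanged with an empty second component.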

lpw25 commented 8 years ago

Should be fixed by https://github.com/lpw25/doc-ock/commit/8ac151b31153a30b60d9e53e52a2b4e43574b8a4

dsheets commented 8 years ago

I considered something like this but I don't think it catches all invalid references and any that get passed downstream will get tagged with resolution errors anyway.

Also, there are serialization/parse errors for things like (** @author *) which have empty bodies. The serializer patch in lpw25/doc-ock-xml#14 addresses those oddities, too.

lpw25 commented 8 years ago

> I considered something like this but I don't think it catches all invalid references and any that get passed downstream will get tagged with resolution errors anyway.

We should probably try to catch all invalid references at source, but for now I'm happy for doc-ock to just require names (i.e. the things between dots) to be non-empty.
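A minimal sketch of the "names must be non-empty" rule, assuming a reference arrives as a plain dotted string. `check_reference` is a hypothetical name and this is not the code from the linked commit; it only illustrates the invariant being described.

```ocaml
(* Hypothetical check, not doc-ock's actual code: a reference is accepted
   only if every dot-separated name is non-empty. *)
let check_reference (s : string) : (string list, string) result =
  let names = String.split_on_char '.' s in
  if List.for_all (fun name -> name <> "") names then Ok names
  else Error (Printf.sprintf "invalid reference %S: empty name" s)
```

Under this rule `"Foo.bar"` is accepted, while `"Foo.bar."`, `".Foo"`, `"Foo..bar"` and the empty string are all rejected. Splitting naively on `.` does not account for operator names that themselves contain a dot (such as `+.`), so a real check would need a proper reference parser.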

> Also, there are serialization/parse errors for things like (** @author *) which have empty bodies. The serializer patch in lpw25/doc-ock-xml#14 addresses those oddities, too.

I would rather go through each of these cases and turn them into errors in doc-ock, since they are definitely documentation errors. Empty strings are a pain because they screw up error messages, which can make it difficult for users to identify the problem.

lpw25 commented 8 years ago

> I would rather go through each of these cases and turn them into errors in doc-ock, since they are definitely documentation errors.

Having said this, it is not like we check all invariants when we read in the XML, so I suppose we might as well merge https://github.com/lpw25/doc-ock-xml/pull/14 as well.

dsheets commented 8 years ago

I am writing an indexing/error detection pass for codoc right now. One thing I like about that layer is that I can detect errors at a smaller granularity than per comment so the non-error parts of a comment can still render. More errors at the doc-ock layer would be good, too, and I agree that whatever passes through the XML interfaces needs to be relatively untrusted (so the whole parser shouldn't fall over on missing data signals).
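To make the granularity point concrete, here is a small sketch in which each fragment of a comment is checked on its own, so one bad reference is rendered as an inline error while the rest of the comment still renders. The `fragment` and `checked` types and all function names are hypothetical and are not codoc's actual data model.

```ocaml
(* Hypothetical comment model, not codoc's actual types. *)
type fragment =
  | Text of string
  | Ref of string  (* raw reference, e.g. "Foo.bar" *)

type checked =
  | Good of fragment
  | Bad of string  (* error message for this fragment only *)

(* Flag a reference if any dot-separated name in it is empty. *)
let check_fragment = function
  | Ref r when List.exists (fun n -> n = "") (String.split_on_char '.' r) ->
    Bad (Printf.sprintf "invalid reference %S" r)
  | f -> Good f

let render = function
  | Good (Text t) -> t
  | Good (Ref r) -> "[" ^ r ^ "]"
  | Bad msg -> "<error: " ^ msg ^ ">"

(* Errors stay local: every fragment is checked and rendered independently. *)
let render_comment (fs : fragment list) : string =
  String.concat "" (List.map (fun f -> render (check_fragment f)) fs)
```

For example, rendering `[Text "See "; Ref "Foo.bar."; Text " and "; Ref "Baz.t"]` keeps the surrounding text and the valid reference, and only the malformed reference is replaced by an error marker.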

lpw25 commented 8 years ago

> One thing I like about that layer is that I can detect errors at a smaller granularity than per comment so the non-error parts of a comment can still render.

I hope to do a similar thing for the octavius parser at some point because the current approach of throwing away the whole comment is very unsatisfactory. Of course when octavius has been made more general and less doc-ock specific, and doc-ock is made polymorphic in the documentation type, then the errors that we are talking about will be handled by codoc anyway.

> I agree that whatever passes through the XML interfaces needs to be relatively untrusted (so the whole parser shouldn't fall over on missing data signals)

I am unsure exactly what to do about this in general. If there are missing fields in the XML then it is not possible to produce a valid doc-ock unit, and I'm not sure I see the value in representing an invalid unit. It would probably help if the XML printer did the same checks as the XML parser; that way, invalid XML would mean that the XML has been corrupted, and so failing the whole parse would be reasonable behaviour.

dsheets commented 8 years ago

Parse failure is reasonable for a lot of cases. I'm thinking specifically of issues like this one, where xmlm does not give you an empty data signal. A less internally inconsistent example would be accepting unparsed references (<ref>Foo.bar</ref>), but this is not needed (and not a toolchain error) at the moment.
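A sketch of what tolerating a missing data signal could mean, assuming (as described above) that the XML layer emits no data signal at all for an element with an empty body. The `signal` type below is a simplified stand-in written for this example, not xmlm's real signal type, and `read_body` is a hypothetical function.

```ocaml
(* Simplified stand-in for an XML signal stream; xmlm's real signal type is
   richer. All names here are hypothetical. *)
type signal =
  | El_start of string
  | El_end
  | Data of string

(* Read the text body of an element whose El_start has already been consumed.
   An El_end with no preceding Data is read as the empty string rather than
   treated as a parse failure. Returns the body and the remaining signals. *)
let read_body (signals : signal list) : (string * signal list, string) result =
  match signals with
  | Data d :: El_end :: rest -> Ok (d, rest)
  | El_end :: rest -> Ok ("", rest)
  | _ -> Error "malformed element body"
```

Here an element that closes immediately is read as having an empty body, while anything else unexpected is still reported as an error, so genuinely corrupted input continues to fail the parse.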

ocaml-doc / doc-ock

Handle trailing dots in references #48