sol / hpack

hpack: A modern format for Haskell packages
MIT License

JSON schema #521

Open lf- opened 2 years ago

lf- commented 2 years ago

It would be super cool if there was a JSON schema for hpack, which would automatically enable IDE tools to provide autocompletion and checking of hpack files. I'm motivated for this by my terrible memory for the syntax of both cabal files and hpack files.

I am filing this as a good-first-issue kind of thing, not a request for you to write it necessarily.

sol commented 2 years ago

Sounds good. 👍

In an ideal world we would derive the schema and documentation from the same representation that we use for parsing. However, I'm not sure how practical this is right now.

I think these are actionable items:

  1. Make a proof of concept for some smallish type; trying to derive the schema and documentation, in addition to the JSON instances. We don't care much about the exact type representation, as we have dedicated types that are only used for parsing. So it's ok to clutter these types with TypeLits all over the place.
  2. If somebody wants to produce the schema in a more direct / manual fashion then that will be great too. This may yield results faster and could serve as a reference / golden test for a generic implementation, if we ever get to it.

tfausak commented 2 years ago

I haven't used it before, but you may be able to use Autodocodec for this.

sol commented 2 years ago

We currently have our own ad-hoc generic machinery in Data.Aeson.Config.FromValue that:

I'm not sure what the situation is with other implementations. Ideally, we would also want to derive the reference (the tables in our current README) from that same representation.

@NorfairKing just in case, are these things within the scope of Autodocodec? Where are the tests located in the source tree btw? I just took a look but only saw a Doctest driver.

NorfairKing commented 2 years ago

> @NorfairKing just in case, are these things within the scope of Autodocodec?

The warnings are in scope but not implemented. The field aliases are already supported. Deriving a JSON schema is the exact use case of autodocodec. Even better would be a human-readable, syntax-highlighted schema, which you get with autodocodec-yaml.

> Where are the tests located in the source tree btw? I just took a look but only saw a Doctest driver.

The readme has a section on this: https://github.com/NorfairKing/autodocodec#tests

sol commented 2 years ago

@NorfairKing looking at the example from the README, links to the generated output (schema, sample JSON, etc) could be a nice addition. I am not on a computer right now, so I can't try. And even when I'm back at the computer, I am swamped with a plethora of other things.

Questions:

  1. From what I understand, the HasCodec instance is used to derive all the other instances. The example still derives a Generic instance; for what is that instance used/necessary? Or is that Generic instance unused?
  2. Is there a generic implementation for HasCodec, or some other "generic way" to utilize this library?
  3. If there is no generic implementation, have you tried implementing this in a fully generic way? If yes, did you hit any road blocks?

NorfairKing commented 2 years ago

> @NorfairKing looking at the example from the README, links to the generated output (schema, sample JSON, etc) could be a nice addition. I am not on a computer right now, so I can't try. And even when I'm back at the computer, I am swamped with a plethora of other things.

Those are here: https://github.com/NorfairKing/autodocodec/tree/master/autodocodec-api-usage/test_resources

> The example still derives a Generic instance; for what is that instance used/necessary? Or is that Generic instance unused?

It's not necessary, that's used for another part of the tests.

> Is there a generic implementation for HasCodec, or some other "generic way" to utilize this library?

No, and that's the point. You're forced to document every field in the schema (or deliberately circumvent that).

> If there is no generic implementation, have you tried implementing this in a fully generic way? If yes, did you hit any road blocks?

Yes, I have. The roadblock is that it's a bad idea: the entire point is to document your implementation, and a generic implementation circumvents that.

sol commented 2 years ago

> > @NorfairKing looking at the example from the README, links to the generated output (schema, sample JSON, etc) could be a nice addition. I am not on a computer right now, so I can't try. And even when I'm back at the computer, I am swamped with a plethora of other things.

> Those are here: https://github.com/NorfairKing/autodocodec/tree/master/autodocodec-api-usage/test_resources

I looked at that; it was just not immediately clear to me which exact files correspond to the example from the README. Not that important, though. Once you understand what the example is doing, it's easy to imagine what the schema will look like.

> > The example still derives a Generic instance; for what is that instance used/necessary? Or is that Generic instance unused?

> It's not necessary, that's used for another part of the tests.

> > Is there a generic implementation for HasCodec, or some other "generic way" to utilize this library?

> No, and that's the point. You're forced to document every field in the schema (or deliberately circumvent that).

Ok, I guess that makes sense. For Hpack specifically we could annotate the fields with type literals (as we already do for aliases) and generically derive the documentation from that. But I think that's only really an option if you use those types for parsing only, and nothing much else.

I like how the encoder encapsulates everything in a single value, btw.

sol commented 2 years ago

For completeness, I think conceptually this is what we would want:

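{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeFamilies #-}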
module Person where

import           Data.Coerce
import           GHC.Generics
import           GHC.TypeLits

newtype AnnotatedField (documentation :: Symbol) a = AnnotatedField a

data Annotated
data Parsed

type family Field representation (documentation :: Symbol) a

type instance Field Annotated documentation a = AnnotatedField documentation a
type instance Field Parsed documentation a = a

data Person_ representation = Person {
  personName :: Field representation "name of person" String
, personAge :: Field representation "age of person" Int
} deriving Generic

type Person = Person_ Parsed
type AnnotatedPerson = Person_ Annotated

deriving instance Show Person
deriving instance Eq Person

parseAnnotatedPerson :: String -> AnnotatedPerson
parseAnnotatedPerson = genericParse
  where
    genericParse = undefined -- add generic implementation here

parse :: String -> Person
parse = undefined -- coerce . parseAnnotatedPerson

schema :: AnnotatedPerson -> String
schema = genericSchema
  where
    genericSchema = undefined -- add generic implementation here

Note that for Hpack we would need to extend AnnotatedField with additional information (cabal name, aliases, deprecation, ...).
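
To make the `genericSchema` placeholder more concrete, here is a minimal sketch of how the documentation symbols could be read back via GHC.Generics. It is only an illustration under stated assumptions, not Hpack's actual machinery: `SchemaType`, `GFields`, `fieldKey` and `schemaFor` are hypothetical names, the key convention (drop the lowercase selector prefix) is a naive guess, only `String` and `Int` fields are handled, and `show` stands in for proper JSON string escaping.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Char (isLower, toLower)
import Data.Kind (Type)
import Data.List (intercalate)
import Data.Proxy (Proxy (..))
import GHC.Generics
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

newtype AnnotatedField (documentation :: Symbol) a = AnnotatedField a

data Annotated
data Parsed

type family Field representation (documentation :: Symbol) a
type instance Field Annotated documentation a = AnnotatedField documentation a
type instance Field Parsed documentation a = a

data Person_ representation = Person
  { personName :: Field representation "name of person" String
  , personAge :: Field representation "age of person" Int
  } deriving Generic

-- Hypothetical class: JSON Schema type name for a field type.
class SchemaType a where
  schemaType :: Proxy a -> String

instance SchemaType Int where schemaType _ = "integer"
instance SchemaType String where schemaType _ = "string"

-- Hypothetical class: collect (selector, documentation, type) per record field.
class GFields (f :: Type -> Type) where
  gFields :: Proxy f -> [(String, String, String)]

instance GFields f => GFields (D1 c f) where
  gFields _ = gFields (Proxy :: Proxy f)

instance GFields f => GFields (C1 c f) where
  gFields _ = gFields (Proxy :: Proxy f)

instance (GFields f, GFields g) => GFields (f :*: g) where
  gFields _ = gFields (Proxy :: Proxy f) ++ gFields (Proxy :: Proxy g)

-- Matches only annotated record fields; the selector name and the
-- documentation are both read from type-level symbols.
instance (KnownSymbol name, KnownSymbol doc, SchemaType a)
    => GFields (S1 ('MetaSel ('Just name) su ss ds) (K1 i (AnnotatedField doc a))) where
  gFields _ =
    [ ( symbolVal (Proxy :: Proxy name)
      , symbolVal (Proxy :: Proxy doc)
      , schemaType (Proxy :: Proxy a)
      )
    ]

-- Assumed naming convention: "personName" becomes "name".
fieldKey :: String -> String
fieldKey selector = case dropWhile isLower selector of
  c : rest -> toLower c : rest
  [] -> selector

-- Render an object schema; `show` is a stand-in for real JSON escaping.
schemaFor :: forall a. GFields (Rep a) => Proxy a -> String
schemaFor _ =
  "{\"type\":\"object\",\"properties\":{"
    ++ intercalate "," (map property (gFields (Proxy :: Proxy (Rep a))))
    ++ "}}"
  where
    property (name, doc, ty) =
      show (fieldKey name)
        ++ ":{\"type\":" ++ show ty
        ++ ",\"description\":" ++ show doc ++ "}"
```

Loaded in GHCi, `putStrLn (schemaFor (Proxy :: Proxy (Person_ Annotated)))` should print a one-line object schema with a `name` property of type `string` and an `age` property of type `integer`, each carrying its description. Taking a `Proxy` rather than an `AnnotatedPerson` value is a design choice of this sketch: the schema depends only on the type, so no value is needed.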

lf- commented 2 years ago

I speculate unconfidently that you might need a usage of the RoleAnnotations extension to make that coercion work: https://gitlab.haskell.org/ghc/ghc/-/wikis/roles
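
For what it's worth, here is a small sketch of the coercion question, under the assumption that the types stay as in the example above. As far as I can tell, `representation` gets a nominal role because it is only used as a type family argument, and a role annotation cannot loosen an inferred role, so a whole-record `coerce` would be rejected either way. Coercing field by field still type-checks, because `AnnotatedField` is a newtype. `stripAnnotations` is a hypothetical name.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Coerce (coerce)
import GHC.TypeLits (Symbol)

newtype AnnotatedField (documentation :: Symbol) a = AnnotatedField a

data Annotated
data Parsed

type family Field representation (documentation :: Symbol) a
type instance Field Annotated documentation a = AnnotatedField documentation a
type instance Field Parsed documentation a = a

data Person_ representation = Person
  { personName :: Field representation "name of person" String
  , personAge :: Field representation "age of person" Int
  }

-- Hypothetical helper: rebuild the record, coercing each field.
-- After the type family reduces, each `coerce` is just a newtype
-- unwrapping (AnnotatedField doc a -> a), which has no runtime cost.
stripAnnotations :: Person_ Annotated -> Person_ Parsed
stripAnnotations (Person name age) = Person (coerce name) (coerce age)
```

This needs one line per record type, so a generic version over `Rep` would probably be the next step for a type with as many fields as Hpack's.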