Open hdgarrood opened 4 years ago
Also, the issues purescript/pursuit#85 and purescript/pursuit#139 have languished for quite a while, and although we probably shouldn't try to address them immediately, I think if we are going to rethink Pursuit's architecture we should at least think a bit about how these features could fit in, so that we can build them later. I guess if we were going for the approach I described above, then the natural approach would be to have the search server house the database which is capable of answering queries such as "what was the earliest version of this package to define this identifier", and then perhaps the HTML generation job could query the search server's API while generating the documentation and interleave that information in the form of a "Since: v0.1.0" or something like that.
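To make the "earliest version" query concrete, here is a minimal sketch of the kind of index the search server could keep. Everything in it (VersionIndex, addRelease, earliestVersion, the package and identifier names) is hypothetical and made up for illustration; it is not Pursuit's actual schema, and a real implementation would presumably sit on top of a database rather than an in-memory map.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Version (Version, makeVersion)

-- Hypothetical aliases, for illustration only.
type PackageName = String
type Identifier = String -- fully qualified, e.g. "Example.foo"

-- | For each (package, identifier) pair, the set of versions defining it.
newtype VersionIndex =
  VersionIndex (Map.Map (PackageName, Identifier) (Set.Set Version))

emptyIndex :: VersionIndex
emptyIndex = VersionIndex Map.empty

-- | Record that one release of a package defines the given identifiers.
addRelease :: PackageName -> Version -> [Identifier] -> VersionIndex -> VersionIndex
addRelease pkg v idents (VersionIndex m) =
  VersionIndex (foldr insertOne m idents)
  where
    insertOne ident = Map.insertWith Set.union (pkg, ident) (Set.singleton v)

-- | The earliest version of the package that defines the identifier, if any.
earliestVersion :: PackageName -> Identifier -> VersionIndex -> Maybe Version
earliestVersion pkg ident (VersionIndex m) =
  Map.lookup (pkg, ident) m >>= Set.lookupMin

-- | A made-up package with two releases.
exampleIndex :: VersionIndex
exampleIndex =
  addRelease "purescript-example" (makeVersion [0, 2, 0]) ["Example.foo", "Example.bar"]
    (addRelease "purescript-example" (makeVersion [0, 1, 0]) ["Example.foo"] emptyIndex)
```

With this, the HTML generation job could call something like earliestVersion "purescript-example" "Example.bar" exampleIndex and render the result as the "Since:" annotation.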
Perhaps what we really just need to do is to define a simplified representation of PureScript types for storage in the Pursuit database, which retains just enough structure to be useful for type search, but not so much that changes to the compiler can break it. I'm thinking: remove the constructors which are not relevant to type search, such as TUnknown, Skolem, ParensInType, BinaryNoParensType, and wildcards (they should be expanded during the initial docs generation), forget about any details which we don't absolutely need, such as foralls (Forall) and kind annotations (KindedType), and potentially also represent rows in a slightly more convenient way. So perhaps the following could work as a starting point:
    data Type a
      -- | A named type variable
      = TypeVar a Text
      -- | A type-level string
      | TypeLevelString a PSString
      -- | A type constructor
      | TypeConstructor a (Qualified (ProperName 'TypeName))
      -- | A type operator
      | TypeOp a (Qualified (OpName 'TypeOpName))
      -- | A type application
      | TypeApp a (Type a) (Type a)
      -- | A binary type operator application
      | TypeOpApp a (Type a) (Type a) (Type a)
      -- | A type with a set of type class constraints
      | ConstrainedType a (Constraint a) (Type a)
      -- | A row
      | Row a [(Label, Type a)]
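As a rough illustration of why a small representation like this is convenient for search, here is a self-contained sketch of a traversal that collects the type constructors mentioned in a type, which is the sort of thing a search index might key on. The compiler's Text, PSString, Qualified, ProperName, OpName, Constraint and Label are replaced with simplified stand-ins so the example compiles on its own; constructorsIn and exampleType are hypothetical names, not anything from Pursuit or the compiler.

```haskell
import Data.List (nub)

-- Simplified stand-ins for the compiler's types (assumptions, not the real ones).
type Label = String
newtype TypeName = TypeName String deriving (Eq, Show)
newtype OpName = OpName String deriving (Eq, Show)

-- | Stand-in constraint: a class name applied to some type arguments.
data Constraint a = Constraint a TypeName [Type a]

data Type a
  = TypeVar a String
  | TypeLevelString a String
  | TypeConstructor a TypeName
  | TypeOp a OpName
  | TypeApp a (Type a) (Type a)
  | TypeOpApp a (Type a) (Type a) (Type a)
  | ConstrainedType a (Constraint a) (Type a)
  | Row a [(Label, Type a)]

-- | All type constructor names mentioned in a type, without duplicates,
-- in order of first appearance.
constructorsIn :: Type a -> [TypeName]
constructorsIn = nub . go
  where
    go ty = case ty of
      TypeVar _ _ -> []
      TypeLevelString _ _ -> []
      TypeConstructor _ name -> [name]
      TypeOp _ _ -> []
      TypeApp _ f x -> go f ++ go x
      TypeOpApp _ op l r -> go op ++ go l ++ go r
      ConstrainedType _ (Constraint _ _ args) t -> concatMap go args ++ go t
      Row _ fields -> concatMap (go . snd) fields

-- | Roughly "Array a -> Maybe a", with the function arrow written as an
-- ordinary constructor application.
exampleType :: Type ()
exampleType =
  TypeApp ()
    (TypeApp ()
      (TypeConstructor () (TypeName "Function"))
      (TypeApp () (TypeConstructor () (TypeName "Array")) (TypeVar () "a")))
    (TypeApp () (TypeConstructor () (TypeName "Maybe")) (TypeVar () "a"))
```

Because there are only eight constructors and none of them are compiler-internal, a traversal like this would not need to change when the compiler's own Type gains or loses cases.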
@hdgarrood how would you see docs-search in the picture?
I think we could actually replace the backend-side search from Pursuit with frontend-side search (using docs-search), e.g. see Starsuit
My only problem with moving search to the frontend is that you need to have every version of every package around to be able to answer queries such as "what was the earliest version that this function was in" or "how did this module's interface change between v1.0.0 and v2.0.3", and I don't think that will be feasible with frontend search. At least, the size of the index is going to become a problem much sooner than it will with backend search that way. I'd prefer not to make an architectural change that makes it harder to answer these questions if we can avoid it.
@hdgarrood I think that could be solved by having a richer search index, right? I assume we don't want to enable all possible queries out there (since we'd need special support for every kind of query anyway), so adapting the search index to have facilities to answer them sounds like it could work?
No, I don’t think so - the problem with front end search is that you have to worry about the size of the index. My worry is that including enough information in the index to be able to answer these queries will cause the index to become too large much more quickly.
@hdgarrood makes sense. I think at this point then a goal worth pursuing (heh, pun intended) could be to have a single codebase for frontend and backend search, so that we don't split efforts. (the assumption here is that having offline/local search is useful and desirable) So how would you feel about having that part of Pursuit (read: the backend that answers search queries) in PureScript?
I’m not particularly keen; Pursuit’s search already exists and is written in Haskell, and also obviously the compiler is written in Haskell so I don’t want to make it more difficult to make use of nice things that the compiler can give us. For example, writing the backend in PureScript essentially rules out the possibility of using the same type search that the compiler’s typed holes feature uses.
While I think writing it in PureScript would make it easier for people to contribute because it's already the language they know, PS isn't that mature on the backend. So, even if this was done, wouldn't this slow down development?
@JordanMartinez note that this is already done in docs-search - what do you refer to when you say "slow down development"?
I'm assuming that this will require implementing a server, and I feel like PS doesn't yet have as good an ecosystem for building such a thing when compared with Haskell. It seems like one would need to reinvent the wheel a few times and that's what would "slow down" the development.
So, this is just a general feeling/belief I have about the situation, not something based on fact.
Couldn’t we extract the type search into a PureScript package and wrap a small TCP server around it for usage by Pursuit? We could even try the native backend if Node.js is an issue.
I started to think about replicating the search index to the browser IndexedDB inside a Service Worker for supporting offline searches on Pursuit, and getting different results offline due to subtle differences between two implementations of the search wouldn’t be ideal.
Is the possibility of reusing the typed holes search really more important than reusing the same implementation for online and offline searches on Pursuit and also for local searches in the compiler generated documentation with a web browser or purescript-docs-search CLI?
I think supporting both online and offline searching in Pursuit would be overly complicated for minimal benefit, so I’m not keen on that. Also, it’s not just the possibility of using typed holes search; that was just one example. To give another potential example, I want to implement comparison between module interfaces at different versions of a library inside the compiler, so that the compiler can say eg “this needs a major bump” when publishing a new version. That’s something we would most likely want to be able to use inside Pursuit too, which is why I think it makes the most sense for us to stay in Haskell. If we want to support searching in locally produced documentation and we think it’s really important that they behave in exactly the same way, then I would rather move that search functionality into the compiler so that it can also be used locally.
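To illustrate the interface-comparison idea, here is a deliberately simplified sketch. It models a module interface as a map from declaration names to rendered type signatures and classifies the change between two versions. The names Interface, Bump and requiredBump are hypothetical, and the rule used (anything removed or changed is major, anything only added is minor) is an assumption for the example; what really counts as a breaking change in PureScript is more subtle, for instance where instances are concerned.

```haskell
import qualified Data.Map.Strict as Map

-- | Hypothetical, simplified interface: declaration name -> rendered signature.
type Interface = Map.Map String String

data Bump = Patch | Minor | Major
  deriving (Eq, Ord, Show)

-- | Classify the version bump needed to go from the old interface to the new one.
requiredBump :: Interface -> Interface -> Bump
requiredBump old new
  | removedOrChanged = Major
  | added = Minor
  | otherwise = Patch
  where
    -- Some old declaration is missing from the new interface,
    -- or is present with a different signature.
    removedOrChanged = not (Map.isSubmapOf old new)
    -- The new interface has declarations the old one did not.
    added = not (Map.null (Map.difference new old))
```

If a function like this lived in the compiler, both purs publish and Pursuit could call the same code, which is the kind of sharing argued for above.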
Then again I suppose we could have a hybrid backend with parts in both PureScript and Haskell. I think I need to consider what the actual API exposed by this backend would look like in a bit more detail. I’m not 100% sure we should even consider it a problem that search results might differ. For example, using typed hole search is still very appealing to me, as it would allow much better search results (since the results of typed hole search are guaranteed to type check), but of course that isn’t a possibility with frontend search. To give another example, the Pursuit backend needs to be aware of all versions of all libraries, but from the perspective of docs generated locally, you only care about the current package set. It’s not yet clear to me what that might mean concretely, but it seems plausible that these scenarios are different enough that we shouldn’t consider it a problem if they behave a bit differently.
> Then again I suppose we could have a hybrid backend with parts in both PureScript and Haskell
That’s exactly my point. I don’t even suggest to have the PureScript search server handle HTTP requests itself if a TCP server means that more things can stay in Haskell.
> To give another example, the Pursuit backend needs to be aware of all versions of all libraries, but from the perspective of docs generated locally, you only care about the current package set. It’s not yet clear to me what that might mean concretely, but it seems plausible that these scenarios are different enough that we shouldn’t consider it a problem if they behave a bit differently.
I was assuming that even when Pursuit has knowledge of all versions of all packages, it should be able to answer requests scoped to a specific package set. Am I mistaken?
Reading through this issue again it seems like it's more related to Pursuit itself rather than the Registry, so I'll move it over there and clarify the title
Quite a bit of Pursuit's design followed from the constraints imposed by the fact that we didn't have a registry of our own, so this is probably a good opportunity to revisit that.
Currently, Pursuit only accepts JSON package uploads, using the schema produced by the compiler when you run purs publish, which is defined in Language.PureScript.Docs.Types. The original reason for this is twofold:
The current architecture has a few drawbacks:
The Type data type, for representing PureScript types, appears in the JSON schema. For instance, polykinds is likely to cause breaking changes to the format. Breaking changes to the format usually necessitate regenerating the database, which I think we have done two or three times now, and it usually means that older packages can no longer be hosted on Pursuit, which is a shame.
I'd like to investigate splitting Pursuit into a couple of separate services: a job, probably associated with this repo, which can generate and upload static HTML docs to pursuit.purescript.org, and which also can generate search index data to upload to a search server. The static HTML docs could potentially be hosted on GH pages, and the search server would be run in DigitalOcean on our own infrastructure. That way, if the search server goes down, people can still access static HTML docs.
There are a few things we'd need to be careful of: for example, making sure that links don't break will require a bit more care since we won't be able to take advantage of Yesod's type-safe routes any more.
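One way to keep some of that safety without Yesod might be to have the static HTML generator build every link through a single route type and rendering function, so that a URL layout change happens in one place. The sketch below is only an illustration: Route, PackageR, ModuleR and renderRoute are hypothetical names, and the path layout is an assumption chosen for the example.

```haskell
import Data.List (intercalate)

-- | Hypothetical set of pages the static generator can link to.
data Route
  = PackageR String String        -- ^ package name, version
  | ModuleR String String String  -- ^ package name, version, module name

-- | Render a route to a site-relative path. Every link in the generated
-- HTML would go through this function rather than being built by hand.
renderRoute :: Route -> String
renderRoute route = "/" ++ intercalate "/" (segments route)
  where
    segments (PackageR pkg version) = ["packages", pkg, version]
    segments (ModuleR pkg version modName) = ["packages", pkg, version, "docs", modName]
```

This does not check that the target page exists, which Yesod's routes also would not do for dynamic content, but it does rule out typos in hand-written paths.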