Open gallais opened 3 years ago
I think I could tackle this.
With the environment variables we don't need to deal with hierarchies of names, but as I understand it with logs we need to be able to deal with them (e.g. elab
, elab.with
etc). Do we want users to have to write out every possible "path" of log contexts/names, or do we want a smarter way of doing it (e.g. checking membership of a tree structure of some sort).
There are two aspects to this:
I think you are right that it would be good to print the whole thing not as a list
but as a hierarchical structure. We could do so by reusing (parts of) our existing
functions in Core.Options.Log
to generate a trie of documentation strings & printing it nicely.
It's probably worth writing this printer generically for any Libraries.Data.StringTrie
This trie could also be used to efficiently perform the check needed in the first part of the problem.
Declaring topics of interest is what I was specifically referring to, but generating documentation is something I'd not considered, and definitely something that should go into the design.
To clarify, what I was asking was whether "declaring" a topic of interest should be done in a "list like" way, i.e.
topics : List String
topics = [
"core",
"core.elab",
"syntax"
]
-- etc
or if it should be done in a "tree like" way, i.e.
-- or equivalent structure
record LogTree where
constructor MkTree
name : String
children : List LogTree
-- We can't have "" as the top level name, as otherwise we will require `.core.elab`
options : List LogTree
options = [
MkTree "syntax" [],
MkTree "core" [
MkTree "elab" [],
MkTree "data" []
]
]
I think there are good arguments for both - the former is a bit more readable I think, but the latter does make the tree a bit more explicit.
I think it should be done like this:
topics : StringTrie String
topics = foldr insert Trie.empty [
("core", "Anything related to core")
-- ^ ^- the associated doc
-- |
-- a topic
...
where insert : (String, String) -> StringTrie String -> StringTrie String
can be put together by using the topic-parsing logic we have in
Core.Options.log
together with the lib functions on tries.
How does this work for adding "subtopics"? Do we add them as discrete topics (e.g. core.elab
), and then later (i.e. at query or doc generation time) break them into a topic "tree"?
insert
into the StringTrie
will break down "core.elab"
into ["core", "elab"]
(cf. the logic in parseLogLevel
) and use that as the path to insert the documentation
into the trie
Query should perform the same split & use StringTrie
's lookup that focuses on the
subtree pointed at by the path (apparently it's not part of the library yet: we need
lookup : List String -> StringTrie a -> Maybe (StringTrie a)
).
i.e. a topic is valid if the topic (or any subtopic) is documented.
so elab
is valid even if elab
is not documented but elab.def
and elab.data
are,
it just acts as a synonym for all of its subtopics.
Thanks! That helps clarify a lot. I'll look into StringTrie
properly, as it definitely seems that I'm missing some knowledge there.
So I've had a hack and think I have a working-ish prototype. The only trouble is that I think Idris2' auto search is not deep enough to be able to effectively search within a trie structure. See the following gist for an example: https://gist.github.com/AdamHarries/2dc75d1efd574cdc405401bd8d3ddc15
Essentially, the type checker complains that it can't find an implementation for a successful search, but also (when flipped around) will also complain that it can't find an implementation for an unsuccessful search. I think it must be giving up part-way, but I'm not well versed enough to know what I can do top ask Idris2 to keep searching.
You can use the %auto_implicit_depth <n>
pragma to change the maximum depth for auto implicit search, the default value is 50. Obviously this will increase compile times, and it may hang if the search requires to much memory, but at least you can experiment with it. In my experience, if the datatype you are trying to construct automatically isn't branching deeply you can increase it far enough.
This is just doing normalisation though. The proof is extremely shallow
(a single Just
constructor).
Is the problem that triefind
is not seen as total and thus won't reduce
in the type? Try adding a %default total
at the top of the file?
Adding totality checking (either with %default total
, or with explicit total
s before triefind) didn't make any change. Additionally, bumping the auto implicit depth to 100,000 did nothing, which as you say should be expected.
looking at the logs, it fails with:
LOG eval.stuck:5: Stuck function: Libraries.Data.StringMap.empty
I think it's a visibility issue, all the StringMap
and StringTrie
functions are marked only as export
which does not reduce at compile times, unlike public export
.
Could you explain the difference between export
and public export
for me please? (Or point me somewhere which explains it).
Sure, take a look at the docs here https://idris2.readthedocs.io/en/latest/tutorial/modules.html#export-modifiers.
Cheers!
So that seemed to help the inlining/evaluation, but it is still failing to typecheck. It manages to get up to this expression before failing:
LOG compiler.inline.eval:10: Inlined: (%case !{arg:6} [(%concase Prelude.Types.Just Just 1 [{e:1}] (Treemember.triefind [___, !{arg:2}, !{e:1}])), (%concase Prelude.Types.Nothing Just 0 [] (%con Prelude.Types.Nothing Just 0 []))] Nothing)
This seems to make things work (you may have to add import Data.List
to your
module too: there is a bug in idris that prevent functions from reducing if they
are not in scope)
diff --git a/src/Libraries/Data/StringMap.idr b/src/Libraries/Data/StringMap.idr
index 6b57705e..fc63eb4b 100644
--- a/src/Libraries/Data/StringMap.idr
+++ b/src/Libraries/Data/StringMap.idr
@@ -75,6 +75,7 @@ merge3 (Branch2 a b c) d (Branch3 e f g h i) j k = branch6 a b c d e f g h i j k
merge3 (Branch3 a b c d e) f (Branch2 g h i) j k = branch6 a b c d e f g h i j k
merge3 (Branch3 a b c d e) f (Branch3 g h i j k) l m = branch7 a b c d e f g h i j k l m
+public export
treeLookup : Key -> Tree n v -> Maybe v
treeLookup k (Leaf k' v) =
if k == k' then
@@ -94,6 +95,7 @@ treeLookup k (Branch3 t1 k1 t2 k2 t3) =
else
treeLookup k t3
+public export
treeInsert' : Key -> v -> Tree n v -> Either (Tree n v) (Tree n v, Key, Tree n v)
treeInsert' k v (Leaf k' v') =
case compare k k' of
@@ -124,6 +126,7 @@ treeInsert' k v (Branch3 t1 k1 t2 k2 t3) =
Left t3' => Left (Branch3 t1 k1 t2 k2 t3')
Right (a, b, c) => Right (Branch2 t1 k1 t2, k2, Branch2 a b c)
+public export
treeInsert : Key -> v -> Tree n v -> Either (Tree n v) (Tree (S n) v)
treeInsert k v t =
case treeInsert' k v t of
@@ -204,20 +207,20 @@ data StringMap : Type -> Type where
Empty : StringMap v
M : (n : Nat) -> Tree n v -> StringMap v
-export
+public export
empty : StringMap v
empty = Empty
-export
+public export
singleton : String -> v -> StringMap v
singleton k v = M Z (Leaf k v)
-export
+public export
lookup : String -> StringMap v -> Maybe v
lookup _ Empty = Nothing
lookup k (M _ t) = treeLookup k t
-export
+public export
insert : String -> v -> StringMap v -> StringMap v
insert k v Empty = M Z (Leaf k v)
insert k v (M _ t) =
@@ -245,7 +248,7 @@ export
fromList : List (String, v) -> StringMap v
fromList l = foldl (flip (uncurry insert)) empty l
-export
+public export
toList : StringMap v -> List (String, v)
toList Empty = []
toList (M _ t) = treeToList t
@@ -266,7 +269,7 @@ treeMap f (Branch2 t1 k t2) = Branch2 (treeMap f t1) k (treeMap f t2)
treeMap f (Branch3 t1 k1 t2 k2 t3)
= Branch3 (treeMap f t1) k1 (treeMap f t2) k2 (treeMap f t3)
-export
+public export
implementation Functor StringMap where
map _ Empty = Empty
map f (M n t) = M _ (treeMap f t)
Ah, I avoided the first three public export
(on the various tree insert etc functions) as I assumed that was part of a private API. That makes sense though.
Is the bug affecting Data.List, or the other functions from StringMap?
The Data.List
import is so that split
reduces. We had similar issues in #1086
Apologies for the radio silence - work has been rather hectic the past couple of weeks. I'm hopeful that I'll be able to make some progress on this at the weekend.
So I've had a crack at a bit of it, and made some limited progress, as seen here: https://github.com/idris-lang/Idris2/pull/1180
It's proving a little difficult to stick to the original design. Specifically, I've simplified and modified the proof burden to remove references to any other modules, as they require that any caller also import that module. e.g. if our function is foo : (arg : A) -> {auto proof : IsJust somecall} -> B
, then anyone calling foo
will have to import Data.Maybe
so that IsJust
is available to them. Instead, we simply rely on boolean equality, and convert to bools within the somecall
function. This probably causes more evaluations, but is much more ergonomic for the users I think.
Additionally, evaluation seems to be a bit of a problem when we have as many options as we do (I've used grep/sort/uniq to extract them). Firstly, there were a few options that simply wouldn't resolve to either True
or False
, and an experiment with my "older" sketch (see https://gist.github.com/AdamHarries/33c54afc57f735884ff31dfa6124ae4b) hangs Idris2 by causing it to go into an infinite loop. I think this may be an issue with statically constrained logging as we have a lot of existing options, unless we re-think or re-design logging in general.
Building the trie at compile time may be more expensive than I assumed. And it may be that we don't have enough caching of the normal form of toplevel values and that the trie is being rebuilt every time we try to check for membership?
How bad is it if we work directly on the assoc list of (String, String)
?
(Also, I don't think we need to public
export any of the logX
)
I had a look at this. If instead of using a StringTrie
we just use a list of pairs and lookup the whole log name there, it typechecks instantaneously. For pretty-printing the tree of log levels we could still go via some other data structure if necessary, but for compiletime safety (= not using an undefined loglevel) the list should be enough (gist: https://gist.github.com/stefan-hoeck/477ec6dc5e7344a14d01f522a94496f9).
@AdamHarries are you still working on this? Otherwise I could continue here, in order to get a complete list of log topics for #1397.
Hi @stefan-hoeck! Please feel free to take over this issue from me - I sadly don't have as much time for OSS work as I did when I initially picked it up, so I won't be working on it any time soon.
Thanks for your quick reply! I'll give this a try then.
At the moment we have quite extensive (and everyday expanding) logging options to get Idris to print more debug output. However these are not documented anywhere and the best way to discover them is by
grep
ping the source code. This is less than ideal.In
Idris.Env
we use a nifty trick to ensure that all of the environment variables we attempt to observe in the codebase are thoroughly documented.It would be nice to have the same enforcement for logging options.
NB: there are 450+ uses of
log
in the codebase so it'll take a bit of time to collect all of the existing topics. The documentation part can be left for a future PR: having an exhaustive list would already be pretty useful!