mirage / ocaml-uri

RFC3986 URI parsing library for OCaml
Other
97 stars 57 forks source link

Add pct encoder #147

Closed orbitz closed 3 years ago

orbitz commented 4 years ago

@avsm So this implements the customization we talked about.

I think it works well for to_string but I feel it clutters up the individual string accessors. It also elucidates that not all component accessors perform pct encoding, I'm not sure if that is deliberate or not.

What do you think?

orbitz commented 4 years ago

@avsm I got this to pass tests. I don't know if what I did is correct, and I did not clean up all existing work (specifically there seems to be some Uri_re files left. I don't know enough about @anmonteiro change to fix those, but if someone wants to direct me I'm happy to look at it.

But, this is at least mergable in the "build is green" sense of the word.

anmonteiro commented 4 years ago

I don't think deleting the legacy tests is accurate. I think the problem here is that the tests for the uri package depend on the uri-re package, which is not a dependency that we want to publish, and CI doesn't find them for some reason.

orbitz commented 4 years ago

I deleted the legacy test because the Uri_legacy model couldn't be found on testing. Missing something else?

Is it correct that the legacy tests are verifying that this new implementation matches the old implementation? If so, is that necessary?

anmonteiro commented 4 years ago

I deleted the legacy test because the Uri_legacy model couldn't be found on testing. Missing something else?

This is what I was trying to express above:

the tests for the uri package depend on the uri-re package

the uri-re package is where the legacy implementation now lives.

orbitz commented 4 years ago

Ah ok. Is that something you want to fix in your branch and I can rebase my changes on it? Or do you can just tell me what to do to fix it.

anmonteiro commented 4 years ago

I'm not sure how to fix it. I think it's a problem of how CI runs the tests, something along the lines of opam install -t .... I'm not an OPAM user so I'd appreciate more guidance here.

orbitz commented 4 years ago

@avsm can you lend us any insight?

tmcgilchrist commented 4 years ago

This looks like some internal timeout in the build infrastructure, not sure @avsm


#1 sha256:a1d94883c64a2cbdf64f0985d646ac381e7827f60258d78cb25670fd48ca96d8
#1 ...

#2 [internal] load .dockerignore
#2 sha256:5828acd1839bccbb9930d2cf0923fd4e7a3d81fd915f7aa74a63bd2dd028897f
#2 ERROR: no active session for jzb7zv9j8b3tuubj4a8siukmx: context deadline exceeded

#1 [internal] load build definition from Dockerfile
#1 sha256:a1d94883c64a2cbdf64f0985d646ac381e7827f60258d78cb25670fd48ca96d8
#1 ERROR: no active session for jzb7zv9j8b3tuubj4a8siukmx: context deadline exceeded
------
 > [internal] load .dockerignore:
------
------
 > [internal] load build definition from Dockerfile:
------
failed to solve with frontend dockerfile.v0: failed to resolve dockerfile: failed to build LLB: no active session for jzb7zv9j8b3tuubj4a8siukmx: context deadline exceeded
docker-build failed with exit-code 1
2020-08-28 12:59.56: Job failed: Failed: Build failed```
tmcgilchrist commented 4 years ago

Rebuild the failing parts and they pass now. Strongly suspect there was a CI timeout environment issue.

orbitz commented 4 years ago

@tmcgilchrist Nice! Thank you. @avsm How's this change looking to you?

orbitz commented 3 years ago

@avsm @tmcgilchrist @anmonteiro One thing I noticed is the new parser means uri does not work in jsoo. Is this desired?

anmonteiro commented 3 years ago

@orbitz Could you clarify why that is the case? The only change that I could imagine would cause that is the switch to Angstrom, though that has JSOO support (see https://github.com/inhabitedtype/angstrom/pull/187)

orbitz commented 3 years ago

@anmonteiro ahhhh you're right, I had some old dependencies. Upgraded and everything works.

Any idea how to get this version of uri released?

dinosaure commented 3 years ago

So, this PR has what @anmonteiro did on #142 if I understand correctly? If it's the case, can we close #142?

Then, I can take my time to:

orbitz commented 3 years ago

So it looks like I got bit by opam caching some results and it does not work. I get the following error:

bigstringaf_blit_from_bytes not implemented

I created the following topic to see if anyone has suggestions.

https://discuss.ocaml.org/t/bigstringaf-blit-from-bytes-not-implemented/6524

dinosaure commented 3 years ago

I tried your PR (with opam pin) and did this little executable:

let v = Uri.of_string "http://localhost/" in
Format.printf "%a\n%!" Uri.pp v

Which is compiled with js_of_ocaml:

(executable
 (name main)
 (modes js)
 (libraries uri))

It seems that all works for me when I do:

$ nodejs _build/default/main.bs.js
https://localhost/

My version is angstrom.0.14.1 with bistringaf.0.6.1.

orbitz commented 3 years ago

All is good. With some help from @dinosaure I figured out the issue was on my end based on how I'm compiling my program.

@dinosaure for your original points above: the interface to Uri should not have changed so I don't think releasing this will impact the ecosystem.

dinosaure commented 3 years ago

@dinosaure for your original points above: the interface to Uri should not have changed so I don't think releasing this will impact the ecosystem.

Yes it's mostly about behavior when uri moved to angstrom - and we should take care about some details according to MirageOS (but from what I know, all should be good). I tested your branch on my universe and my unikernels and I did not get any weird behaviors (however, it's not a truly check). For my perspective, it's ready to merge, however I would like to have the opinion of @mirage/core before - so just waiting one week and we will be ready to make a release :+1:.

dinosaure commented 3 years ago

Let's start for a release 👍

avsm commented 3 years ago

Only one way to find out :-) It's all good to me by inspection. This is not a gater to the release, but I'd like to compare the old and new implementation using memtrace to check that the new one is more constant-memory than the old regex based one.