eddelbuettel / rcppsimdjson

Rcpp Bindings for the 'simdjson' Header Library
116 stars 13 forks source link

Updating to simdjson 1.0 #70

Closed lemire closed 3 years ago

lemire commented 3 years ago

This PR should not be merged and released as is. But I wrote it to help fix the issue.

Ok. So what is going on in the transition to simdjson 0.9 that requires so many changes ?

Well. We were returning results in the form of an std::pair but we were not expecting people to use it as an std::pair... (although, to be fair, we sometimes did it ourselves) yet people did. And what happened was that people would just ignore the error field. They would then consume garbage, and complain that simdjson was producing garbage.

So we closed off the std::pair by making it a protected inheritance. Sadly, it breaks structured binding which was nice but never worked right with all C++ runtimes (libc did not like it).

This does not impact at all people who used our normal API. Want a double? Do double(element). But I understand that R does not want exceptions thrown? Well. Thankfully, we support both exception-based and no-exception usage.

What about rcppsimdjson? Well rcppsimdjson would do result.first to get the result and ignore the error code. This was, presumably, always safe because error conditions were otherwise checked.

We don't want you to continue doing that. Instead, you now must do result.value_unsafe().

Yes. It is ugly. But the use of value_unsafe() is meant to indicate that you are very much doing something unsafe. And that's what we want the code to show so that people are careful. The .first syntax does not have the same air of danger around it.

lemire commented 3 years ago

I am releasing simdjson 1.0 today, so I refreshed this PR with simdjson 1.0.

Before definitively releasing 1.0, I will wait to make sure that everything is green. (It should be!)

lemire commented 3 years ago

It is green.

eddelbuettel commented 3 years ago

Hm. I was about to merge this evening as it had nicely stabilized; now you chose to rock the boat again -- but thanks for updating to 1.0.0 and congrats on that release.

Can you please remove hacking.md? That does not belong here (maybe in the wiki?) and I tend to prefer a simpler / more compact style Docker (see eg here for a set or here for a concrete one.

lemire commented 3 years ago

now you chose to rock the boat again

I did not expect my change to 'rock the boat'. It was conservative update.

I will check the hacking file...

lemire commented 3 years ago

I see that you went ahead and removed it.

I honestly do not remember adding this file. I do have notes, but they are on a gist somewhere.

eddelbuettel commented 3 years ago

I will check the hacking file...

I just removed it, and merged.

I looked into C++14 and/or C++11 -- not a chance right now. Part of that may by now be our fault :-/ as @knapply and I thought we could plow ahead with C++17 given that you imposed it. I'd be up for opening an issue and cleaning this up.

But before that I think I should release what we have as either 0.1.6 or 0.2.0. It nice to have a refresher, and it is very nice to be current with simdjson 1.0 -- so big, big Thank You! to you for updating the PR for it. And, of course, for having made the PR in the first place.

lemire commented 3 years ago

Thanks to you.

It is definitively the case that simdjson supports C++11. I cannot recall when we went back and decided to support C++11 exactly, but I think that the credit goes to @jkeiser who worked a h*ll of a lot on making this happen. We now have C++11 tests in CI.

Note that now that you have merged this, it will become easier, in the future to adopt On Demand for extra performance.

eddelbuettel commented 3 years ago

I may not have made myself sufficiently clear.

As you know, there is code is RcppSimdJson that is not the included simdjson.{cpp,h}. If you go in and enforce a different compilation standard, easiest by just wrriting CXX_STD = CXX11, say, in this line

https://github.com/eddelbuettel/rcppsimdjson/blob/631ea2ec50f11e2bedbbb7497ea7ceee339343b3/src/Makevars.in#L3

as well as adjusting accordingly little corners likes this one

https://github.com/eddelbuettel/rcppsimdjson/blob/631ea2ec50f11e2bedbbb7497ea7ceee339343b3/src/deserialize.cpp#L1-L3

the you will see, with a tear or two in your eyes, that compilation goes belly up. Irregardless of the fine work you folks did upstream. To paraphrase, "we have met the enemy and he is in this code directory". PRs welcome. I won't have time to chase this, and as CRAN leaves me alone with C++17 as long as I am explicit about it (which 0.1.5 was not that sufficiently, the next release will be) I am also not all that concerned.

I hope this clarifies the issue. It is not your upstream code. It is use down here in RcppSimdJson currently making it C++17.

lemire commented 3 years ago

You were perfectly clear, I was just giving you the confirmation that we fully support C++11.

eddelbuettel commented 3 years ago

Yes for approximately the third time, and it is probably also the third time I am trying to explain that that alone does not buy me lunch, nor that I care too much. So I can assume we are both good on this now?

lemire commented 3 years ago

You and I have fun exchanges. :-) It is almost like we disagree all the time, except we never disagree.

Yes, we are good.

eddelbuettel commented 3 years ago

I think I'll ship it later to CRAN as 0.1.6.

Or should I wait for you to pop the cork on 1.0.0 over at https://github.com/simdjson/simdjson ?

eddelbuettel commented 3 years ago

This went to CRAN late-ish Central time and arrived there early today, see https://cran.r-project.org/package=RcppSimdJson

lemire commented 3 years ago

Great news. We have pushed simdjson 1.0 as well.

This was more work than I thought it would be.

I am a bit sad that we do not have (yet) a super fast On Demand wrapper. But Rome was not built in a day.