rgeo / rgeo-proj4

Proj4 extension for rgeo.
MIT License
13 stars 14 forks source link

Upgrade PROJ API to 6.2+ #10

Closed keithdoggett closed 3 years ago

keithdoggett commented 3 years ago

This PR keeps the current C extension interface the same but upgrades the underlying PROJ version to 6.2+ instead of using the deprecated API.

This could be a good candidate for a minor release that will work on newer versions of Ubuntu and get the gem away from the deprecated API. Eventually, though, a thorough upgrade should be done that pulls in all of the new features of the PROJ library and use current best practices for PROJ (proj string syntax seems to have changed a bit since this was released).

This is not currently passing all the tests but is passing the majority of them, so hopefully just a few tweaks can get this to being releaseable.

keithdoggett commented 3 years ago

@arkirchner this is what I have so far. Looks like the .travis.yml and extconf.rb files need to be updated as well, but it may work better than the tests show on your system if you want to try it.

Feel free to make any contributions or leave suggestions here. Thanks!

arkirchner commented 3 years ago

@keithdoggett Sorry for the late replay. Thank you very much for sharing your current progress! I will try my best to help!

arkirchner commented 3 years ago

I tried to get forward with the C implementation. The last 3 days I worked on this I made no progress ... 😢

I tried to not convert the proj strings into a CRS with +type=crs. This will create a transformation object instead of an CRS. The only thing I have achieved with that is creating the old canonical string.

pj = proj_create(0, '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs');
str = proj_pj_info(pj).definition;

str will be proj=longlat ellps=WGS84 datum=WGS84 no_defs towgs84=0,0,0. So the original canonical without the + in front.

Well this doesn't help much but I felt like sharing. I try to a version in ffi for now. If I find a solution that way I will implement it in C.

I hope you have better luck than me!

keithdoggett commented 3 years ago

Thanks for looking into this more. If you want to discuss anything about the changes you make, I'd be happy to take a look.

mzagaja commented 3 years ago

@keithdoggett Is there a timeline for when this is going to be finished, or is there a particular thing that is preventing it from being finished?

keithdoggett commented 3 years ago

@mzagaja, unfortunately, no. This has fallen in priority a bit, but I can probably take another try at it in a month/month and a half. I can also review PRs and give guidance if you want to try to modify this branch.

If I remember correctly, this was pretty close to working, but a lot of the tests needed to be rewritten because the proj string format changed slightly and a "+" is always needed to indicate a CRS is being specified. Have you tried running this branch?

mzagaja commented 3 years ago

@keithdoggett I have not, but if the modifications required are that minor then when I have some bandwidth I'm happy to take a run at it .

keithdoggett commented 3 years ago

@mzagaja I took another look at it last night. There are about a dozen test failures, but it looks like ~10 of them are due to some minor issue.

Also what needs to be added is the "+type=crs" flag to every proj string to specify it's a CRS, not just a "+". In a few weeks I can probably dedicate some time to working on it, but if you take a look in the meantime, let me know.

keithdoggett commented 3 years ago

@mzagaja @arkirchner the version I have up passes locally for me (Ubuntu 18). I'm having issues getting proj to install on Ubuntu runners in the CI. Looks like the macos runners are having issues installing geos or linking it to RGeo.

I'll take another look at getting the CI figured out soon, but in the meantime, if you want to try/review the changes that would be helpful. Thanks!

keithdoggett commented 3 years ago

For some reason, the Ubuntu CI runners in this repo can't install a proj version >4.9, even though with the exact same CI.yml file it works on my fork (https://github.com/keithdoggett/rgeo-proj4/runs/1965549117?check_suite_focus=true) and installs version 6.3.

Edit: My tests are using Ubuntu 20 while these are using Ubuntu 18, even though they both specify ubuntu-latest as the os.

keithdoggett commented 3 years ago

Thanks for the feedback!

Some other points of attention:

* Couldn't we leverage usage of `GC.compact` right now since we are doing all those changes ? You can take a look at [the reference](https://github.com/ruby/ruby/blob/master/doc/extension.rdoc#label-C+struct+to+Ruby+object) or [this great talk](https://youtu.be/1F3gXYhQsAY?t=1868) by @tenderlove (I've pointed to the part about updating a lib to support compaction, but the whole talk is great really)

Yes, I definitely think we should use the newer features where possible and this will be helpful to figure out the best practices on a smaller extension here before moving into the core library. I'll take a look through those soon.

* Did you try the new version for potential memory leaks ? Since we're not expert enough, I think we should use valgrind and/or some other tool to test what you've done here and make sure we won't induce leaks..

Not yet, but I can take a look sometime this week.

I'd be glad to help on these two points whenever I can find enough time. And the GC compact part may not be mandatory, it's just a plus..

Thanks that would be great. We might as well at least try to figure out GC.compact since we're already making big modifications to it.

BuonOmo commented 3 years ago

@keithdoggett one more insight on the GC.compact, we should be really careful on this one since i saw we use RSTRING_PTR for the original_str. Since GC.compact moves references, the pointer stored in C (in the proj4 api) may be pointing to another resource...

keithdoggett commented 3 years ago

@BuonOmo I added support for compaction per the video you sent and your PR on oj (https://github.com/ohler55/oj/pull/650/commits/6e88d14556badf2413f8d9bef5bfc8414561b4bb). I also changed from RSTRING_PTR to StringValuePtr which should be safer with compaction. Do you think we need to add a test for compaction?

The last thing left at this point should just be to find a way to test for memory leaks programmatically. I've done a little bit of research, but will look a little more to see how we could integrate valgrind into a rake task.

BuonOmo commented 3 years ago

@keithdoggett I don't have time to review everything yet unfortunately :/. However, I stumbled over two things that should be interesting:

  1. On RSTRING->ptr (or RSTRING_PTR) vs StringValuePtr, I suggest reading the part struct RString of https://ruby-hacking-guide.github.io/object.html
  2. On testing GC.compact, there's a method for that since ruby 2.7 (https://docs.ruby-lang.org/en/2.7.0/GC.html#method-c-verify_compaction_references).
keithdoggett commented 3 years ago
1. On `RSTRING->ptr` (or `RSTRING_PTR`) vs `StringValuePtr`, I suggest reading the part _`struct RString`_ of https://ruby-hacking-guide.github.io/object.html

After reading through the guide, it seems that no matter how we access the char* it will be fine as long as we don't assign to it. Right now, we only pass it in to proj_create which is assigned to proj_data->pj. Since we never modify proj_data->pj we should be fine. I'm not sure if this method verifies the VALUE is an RSTRING first, but I can check that.

2. On testing GC.compact, there's a method for that since ruby 2.7 (https://docs.ruby-lang.org/en/2.7.0/GC.html#method-c-verify_compaction_references).

I found this article (https://alanwu.space/post/check-compaction/) which demonstrates using GC.verify_compaction_references in a test suite. We could easily add this to ours.

I tried looking at the outputs of GC.compact and GC.verify_compaction_references (they both have the same output), and it is difficult to verify that a specific in a testing context since there are many T_DATA objects. In an IRB shell, it is possible to see that only 1 T_DATA object is moved, but it is around 350 when running the tests. Personally, I think calling GC.verify_compaction_references(double_heap: true, toward: :empty) is enough to verify.

Edit: Interesting thing I found after testing this. If I just use GC.verify_compaction_references it works fine, but if I run it with the arguments, we get a core dump in the line string test on create in the core c extension. I'm going to leave it without arguments for now because this is probably a compaciton issue on the core library.

keithdoggett commented 3 years ago

@BuonOmo I've removed the compaction test for now because LineString creation in the core gem is causing a seg fault when it is enabled. I don't think there's any way to just test the Proj extension for it since the tests rely on RGeo objects. We should probably just test for it once the core library is updated.

Is there anything else you think we need to check before this is ready for release?

keithdoggett commented 3 years ago

Thanks for all the help on this @BuonOmo! 3.0.0 has just been released :tada: