marcosmko commented 4 years ago

Precheck

For bugs, do a quick search and make sure the bug has not yet been reported.
Finally, be nice and have fun!

Environment

Elixir version (elixir -v): 1.9.1
Neo4j and version (Neo4j 3.5.3, etc.): 3.5.8
Connection scheme (bolt://, bolt+routing:// or neo4j://): bolt://
Bolt.Sips version (mix deps): 2.0.0-rc.2
Operating system: Phoenix running on MacOS 10.14, Neo4j running on a docker inside linux VM

Current behavior

Hello! I'm developing a simple server using phoenix and neo4j and started using Bolt_Sips for that. But strangely, a simple query, returning a list of 120 nodes, takes too much time.

Query:

    cypher = """
      MATCH (p: Place)
        RETURN p as place;
    """

(note: p has only a "name" variable)

The following code takes too much time to execute (around 2 seconds in the first try and 1.4 seconds in the second try):

Bolt.Sips.query!(Bolt.Sips.conn, cypher)

I have checked the execution time using

    :timer.tc(fn -> Bolt.Sips.query!(Bolt.Sips.conn, cypher) end)
      |> elem(0)
      |> Kernel./(1_000)
      |> IO.inspect

When querying directly from neo4j using the web dashboard, I got 281ms in the first try and 56ms in the second try, much less than using bolt_sips.

Expected behavior

Much faster query.

Am I doing something wrong? :(
Or is this the expected behaviour?

dominique-vassard commented 4 years ago

Hello, your query and how you execute it is fine. And unfortunately yes, there is quite a big overhead from BoltSips. We will investigate to understand where this comes from and fix it. It can take some time as it can deeply impact the internals of bolt sips.

A quick workaround for your problem is to narrow your query to only what you need, i.e instead of MATCH (p:Place) RETURN p AS place, you can use MATCH (p:Place) RETURN p.name. This way, instead of retrieving something like this:

Bolt.Sips.Response{
  bookmark: "neo4j:bookmark:v1:tx4",
  fields: ["p"],
  notifications: [],
  plan: nil,
  profile: nil,
  records: [
    [
      %Bolt.Sips.Types.Node{
        id: 0,
        labels: ["Place"],
        properties: %{"name" => "place_1"}
      }
    ],
    [
      %Bolt.Sips.Types.Node{
        id: 1,
        labels: ["Place"],
        properties: %{"name" => "place_2"}
      }
    ]
  ],
  results: [
    %{
      "p" => %Bolt.Sips.Types.Node{
        id: 0,
        labels: ["Place"],
        properties: %{"name" => "place_1"}
      }
    },
    %{
      "p" => %Bolt.Sips.Types.Node{
        id: 1,
        labels: ["Place"],
        properties: %{"name" => "place_2"}
      }
    }
  ],
  stats: [],
  type: "r"
}

you retrieve something like this:

%Bolt.Sips.Response{
  bookmark: "neo4j:bookmark:v1:tx4",
  fields: ["p.name"],
  notifications: [],
  plan: nil,
  profile: nil,
  records: [["place_1"], ["place_2"]],
  results: [%{"p.name" => "place_1"}, %{"p.name" => "place_2"}],
  stats: [],
  type: "r"
}

And query time will be divied by 3...

marcosmko commented 4 years ago

Hmm okay, thanks!

I will try your suggestions, and if I have some time, I will try to investigate where is the bottleneck (although I'm just an amateur in elixir).

Cheers.

dominique-vassard commented 4 years ago

Ok, found something with Code.ensure_loaded called too many times....

The test runs on 1000 nodes.

First, it was

Name           ips        average  deviation         median         99th %
run          0.152         6.59 s     ±1.41%         6.62 s         6.66 s

After change, it becames:

Name           ips        average  deviation         median         99th %
run           8.53      117.28 ms     ±5.42%      116.77 ms      140.41 ms

And if logs are disabled, it is:

Name           ips        average  deviation         median         99th %
run          32.82       30.47 ms     ±4.95%       30.20 ms       35.21 ms

It's a little bit faster ;)

I've just have to check one other thing and then I'll commit.

florinpatrascu commented 4 years ago

W⦿‿⦿t!

We’ll have the 2.0 release this week ❤️

dominique-vassard commented 4 years ago

Just a quick ad: @marcosmko if you're using Ecto, https://github.com/dominique-vassard/ecto_neo4j may interest you. Current available version on hex is crap but as soon as bolt_sips 2.0 is available, the new nice version will be released :)

florinpatrascu commented 4 years ago

(if we have travis fixed and passing, then we'll push 2.0 tonight :) )

florinpatrascu commented 4 years ago

and we're 2.0.2 =)

https://hexdocs.pm/bolt_sips/getting-started.html#content

Thank you all for your patience, feedback and code contributions!

You're awesome! ❤️

marcosmko commented 4 years ago

Hello all! I was on a trip last week so I didn't have time to answer 😆

I have just tested here, and it's f***** awesome! Testing with 100 nodes, it became from 1s to 50ms.

Thanks all for your hard work! I hope that I can contribute in the future, as I am still learning elixir 😍

@dominique-vassard Okay, thank you! I will see and try it 😄

florinpatrascu commented 4 years ago

Haha, noiceee 😀 Thank you, for confirming the changes had a positive impact ❤️

florinpatrascu / bolt_sips

Simple Query taking too much time to process #73

Precheck

Environment

Current behavior

Expected behavior