neo4jrb / activegraph

An active model wrapper for the Neo4j Graph Database for Ruby.
http://neo4jrb.io
MIT License
1.4k stars 276 forks source link

Class.all gives "NoMethodError: undefined method `to_sym' for nil:NilClass" (Probably data entry error) #1354

Open Joshfindit opened 7 years ago

Joshfindit commented 7 years ago

Apologies for the poor quality of this issue submission: This came up on a dev branch with a bunch of changes. If this gives you any pointers for info that I can submit to help debug, let me know.

Command that causes the error

Class.all

Expected result

An array containing all nodes from that class

Actual result

NoMethodError: undefined method `to_sym' for nil:NilClass
Did you mean?  to_s
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:22:in `association_key?'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:29:in `block in contains_association?'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:29:in `each_key'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:29:in `contains_association?'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:15:in `extract_association_attributes!'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/initialize.rb:11:in `init_on_load'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/node_wrapper.rb:9:in `block (2 levels) in <top (required)>'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/node_wrapper.rb:8:in `tap'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/node_wrapper.rb:8:in `block in <top (required)>'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/wrappable.rb:27:in `wrap'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/wrappable.rb:11:in `wrap'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:95:in `wrap_by_level'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:102:in `wrap_node'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:75:in `wrap_structure'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:41:in `wrap_entity'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:39:in `map'
... 12 levels...
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:117:in `block in queries'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:142:in `transaction'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:201:in `new_or_current_transaction'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:116:in `queries'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:108:in `query'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session.rb:36:in `block (2 levels) in <class:CypherSession>'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j-core/query.rb:240:in `response'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j-core/query.rb:311:in `pluck'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_enumerable.rb:88:in `pluck'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_enumerable.rb:26:in `result'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_enumerable.rb:13:in `each'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_eager_loading.rb:6:in `each'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy.rb:193:in `to_a'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy.rb:193:in `[]'
    from (irb):1
    from /usr/local/bin/irb:11:in `<main>'

Note

Other classes are perfectly fine, as are other docker containers with the same app and different data.

Steps that I believe caused this error

I was only watching the return values for a while, and when I went back to list all the nodes, this error came up.

Seems like I was able to submit data in to Neo4j that could not be read again.

Runtime information:

Neo4j database version: neo4j gem version: 8.0.6 neo4j-core gem version: 7.0.3

Joshfindit commented 7 years ago

Here's some info that may uncover the error as well:

I went through each node individually by UUID, and all were able to be called, but some came back with truncated properties.

From Neo4j:

name    172.28.0.1@1486310400.8555803
normalized_name 172.28.0.1@1486310400.8555803
raw_input   {:body=>{"value"=>"Datapoint value 4 (Cleaned)"}, :header=>{"CONTENT_LENGTH"=>"42", "CONTENT_TYPE"=>"application/json", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"redacted", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"172.28.0.1", "REMOTE_HOST"=>"172.28.0.1", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"redacted", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"127.0.0.1", "SERVER_PORT"=>"4455", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.3.3/2016-11-21)", "HTTP_CACHE_CONTROL"=>"no-cache", "HTTP_POSTMAN_TOKEN"=>"redacted", "HTTP_AUTHORIZATION"=>"Token token=redacted", "HTTP_USER_AGENT"=>"PostmanRuntime/3.0.5", "HTTP_ACCEPT"=>"*/*", "HTTP_HOST"=>"127.0.0.1:4455", "HTTP_ACCEPT_ENCODING"=>"gzip, deflate", "HTTP_CONNECTION"=>"keep-alive", "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"redacted", "rack.request.query_string"=>"", "sinatra.route"=>"POST redacted"}}
value   Datapoint value 4 (Cleaned)
uuid    8e19341e-bfa8-4bc5-b827-bc169847e219

The result of Datapoint.find("8e19341e-bfa8-4bc5-b827-bc169847e219")

Datapoint
 MATCH (n:`Datapoint`)
 WHERE (n.uuid = {n_uuid})
 RETURN n
 ORDER BY n.uuid
 LIMIT {limit_1} | {:n_uuid=>"8e19341e-bfa8-4bc5-b827-bc169847e219", :limit_1=>1}
=> #<Datapoint uuid: "8e19341e-bfa8-4bc5-b827-bc169847e219", name: "172.28.0.1@1486310400.8555803", normalized_name: "172.28.0.1@1486310400.8555803", raw_input: "{:body=>{\"value\"=>\"Datapoint value 4 (Cleaned)\"}, :header=>{\"CONTENT_LENGTH\"=>\"42\", \"CONTENT_TYPE\"=>\"", value: "Datapoint value 4 (Cleaned)">
Joshfindit commented 7 years ago

Destroyed the nodes starting with the newest and working back until Datapoint.all worked.

It did not work until the following node was destroyed

name    172.28.0.1@1486310376.4773538
normalized_name 172.28.0.1@1486310376.4773538
raw_input   {:body=>{"value"=>"Datapoint value (Cleaned)"}, :header=>{"CONTENT_LENGTH"=>"40", "CONTENT_TYPE"=>"application/json", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"/redacted", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"172.28.0.1", "REMOTE_HOST"=>"172.28.0.1", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"redacted", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"127.0.0.1", "SERVER_PORT"=>"redacted", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.3.3/2016-11-21)", "HTTP_CACHE_CONTROL"=>"no-cache", "HTTP_POSTMAN_TOKEN"=>"redacted", "HTTP_AUTHORIZATION"=>"Token token=redacted", "HTTP_USER_AGENT"=>"PostmanRuntime/3.0.5", "HTTP_ACCEPT"=>"*/*", "HTTP_HOST"=>"redacted", "HTTP_ACCEPT_ENCODING"=>"gzip, deflate", "HTTP_CONNECTION"=>"keep-alive", "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"/redacted", "rack.request.query_string"=>"", "sinatra.route"=>"POST /redacted"}}
value   Datapoint value (Cleaned)
uuid    730ecebc-92e8-408e-91b6-e27c36499534

It now works (until the next time a node is saved, at which point it breaks again)

cheerfulstoic commented 7 years ago

Interesting... Looking at the property.rb file it seems like the attributes method returns a key which is nil (or at least where each_key has a nil in there somewhere) which would be very strange.

Is raw_input your property? Is it serialized?

Joshfindit commented 7 years ago

Here's the path:

  1. raw_input is a String property on the model, just like usual.
  2. classic sinatra app:

    def get_parsed_requestbody(response)
    parsed_response = JSON.parse(response)
    rescue JSON::ParserError => e  
    halt 400, JSON.generate({ status: 400, message: "Check your input." })
    end
post '/test' do
...
  parsedJSON[:body] = get_parsed_requestbody(request.body.read)
  parsedJSON[:header] = request.env.select { |k, v| v.is_a? String }
...
  datapoint.raw_input = parsedJSON
  datapoint.save
...
end
  1. Postman request using raw with JSON (see the headers above)
  2. Fail pulling Datapoint.all using the get '/datapoints' do or irb
cheerfulstoic commented 7 years ago

It seems like parsedJSON because you are setting keys on it, so do you need to do parsedJSON.to_json when setting raw_input so that it's a string? Alternatively you could use serialization:

http://neo4jrb.readthedocs.io/en/8.0.x/ActiveNode.html#serialization

Joshfindit commented 7 years ago

Definitely I was doing something wrong in there; I'm not looking for a fix for my specific issue (though serialization looks like the winner), I just recognized that messing around was basically a Naughty Strings test that uncovered a possible bug:

I was able to submit data that broke the system without getting any errors while doing it.

cheerfulstoic commented 7 years ago

Fair point! ;) . Thanks for the naughty report

Joshfindit commented 7 years ago

Just did it again on a different model and with different input using plain strings. While importing everything was fine, then calling Model.all exposed the problem.

Joshfindit commented 7 years ago

In this case, it's down to just this:

At a certain point, a node is created that breaks .all

Joshfindit commented 7 years ago

And here's a way I can reproduce it just in irb with only a few lines:

require 'json'
require 'jsonapi-serializers'
require 'neo4j'
require 'neo4j/core/cypher_session/adaptors/http'
require 'neo4j/core/cypher_session/adaptors/bolt'
require 'securerandom' # UUID
require 'rexml/document'
require 'plist'
require 'digest'
require 'open3'
require 'tempfile'
require 'sinatra/base'

NEO4J_URL = "bolt://neo4j:password@localhost:7687"

class JunkNode
  include Neo4j::ActiveNode

  property :name, type: String
end

def get_session
  neo4j_adaptor = Neo4j::Core::CypherSession::Adaptors::Bolt.new(NEO4J_URL, {wrap_level: :proc})
  Neo4j::ActiveBase.on_establish_session { Neo4j::Core::CypherSession.new(neo4j_adaptor) }

  # Enable debugging Neo4j
  Neo4j::ActiveBase.current_session.adaptor.logger.level = Logger::DEBUG #DEBUG/ERROR/WARN
  Neo4j::Core::CypherSession::Adaptors::Base.subscribe_to_query(&method(:puts))
  Neo4j::Core::CypherSession::Adaptors::Bolt.subscribe_to_request(&method(:puts))
end

def import_filelist(filename)
    File.open(filename).each_with_index do |line, index|
      if thisJunkNode = JunkNode.find_by(name: line.strip)
      puts "Found node already"
    else
      puts "Does not exist. Creating."
      thisJunkNode = JunkNode.new(name: line.strip)
      thisJunkNode.save
    end
    JunkNode.all[0]
  end
end

get_session
Neo4j::ActiveBase.current_session.query("CREATE CONSTRAINT ON (n:JunkNode) ASSERT n.uuid IS UNIQUE") # Only needs to be run once, but no harm running it multiple times so not wrapping it in a check.

import_filelist("./filelist2.txt")

filelist2.txt is simply generated by:

  1. doing find /path > ./filelist2.txt
  2. Replacing the mount point with the drive's GPT UUID (For example: /Volumes/Storage/file.txt becomes 77AADCB1-ABF7-38D9-8741-27990301DE1D/file.txt)

Example file: filelist.txt.zip Note: The path chosen to generate the file doesn't seem to matter

Joshfindit commented 7 years ago

This is a show-stopper, and I keep hitting it. Essentially, I find myself having to create a bunch of code around all the data entry functions so that I can test for 'database corruption', and it's such a huge increase in workload that it's stopping me from making significant progress.

Am I doing something the wrong way? Am I uncovering a bug that no one else cares about? Is this a sign that there's a more interesting language/stack that other people are using?

subvertallchris commented 7 years ago

Could you make a repo project containing code and instructions to reproduce? I know you have some code above but this would help me and possibly others get to it sooner.

On Mon, Feb 27, 2017 at 8:33 AM Josh notifications@github.com wrote:

This is a show-stopper, and I keep hitting it. Essentially, I find myself having to create a bunch of code around all the data entry functions so that I can test for 'database corruption', and it's such a huge increase in workload that it's stopping me from making significant progress.

Am I doing something the wrong way? Am I uncovering a bug that no one else cares about? Is this a sign that there's a more interesting language/stack that other people are using?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/neo4jrb/neo4j/issues/1354#issuecomment-282720692, or mute the thread https://github.com/notifications/unsubscribe-auth/AD6E93xXZ_Y3PC27STvqxV2Mnuh2DlzRks5rgtC-gaJpZM4L3imF .

Joshfindit commented 7 years ago

@subvertallchris For sure, thanks. Here you go: https://github.com/Joshfindit/neo4jrb-issue-report-1354

subvertallchris commented 7 years ago

It's been running for 25 minutes and no errors. How far into the import does it usually crash?

subvertallchris commented 7 years ago

It finished and then died with this:

/Users/cgrigg/.rvm/gems/ruby-2.2.3/gems/neo4j-core-7.0.5/lib/neo4j/core/cypher_session/adaptors/bolt.rb:47:in `query_set': Making query, but expected there to be no buffer remaining! (RuntimeError)

https://github.com/neo4jrb/neo4j/issues/1311 references that and https://github.com/neo4jrb/neo4j-core/pull/284 might have fixed it. I don't think this is at all related to your issue, though. Maybe create a Gemfile in this repo locked to the specific versions of your dependencies?

Joshfindit commented 7 years ago

@subvertallchris Usually it crashes within a minute.

While working with the issue, I could justify it as being a hiccup in message handling. I did update to the latest version and test it, the issue doesn't happen here.

The repo has been updated with specific gem versions; do you mind testing to see if the issue happens in your setup with the older version?