Class.all gives "NoMethodError: undefined method `to_sym' for nil:NilClass" (Probably data entry error)

Joshfindit commented 7 years ago

Apologies for the poor quality of this issue submission: This came up on a dev branch with a bunch of changes. If this gives you any pointers for info that I can submit to help debug, let me know.

Command that causes the error

Class.all

Expected result

An array containing all nodes from that class

Actual result

NoMethodError: undefined method `to_sym' for nil:NilClass
Did you mean?  to_s
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:22:in `association_key?'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:29:in `block in contains_association?'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:29:in `each_key'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:29:in `contains_association?'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/property.rb:15:in `extract_association_attributes!'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/initialize.rb:11:in `init_on_load'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/node_wrapper.rb:9:in `block (2 levels) in <top (required)>'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/node_wrapper.rb:8:in `tap'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/node_wrapper.rb:8:in `block in <top (required)>'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/wrappable.rb:27:in `wrap'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/wrappable.rb:11:in `wrap'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:95:in `wrap_by_level'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:102:in `wrap_node'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:75:in `wrap_structure'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:41:in `wrap_entity'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/responses/bolt.rb:39:in `map'
... 12 levels...
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:117:in `block in queries'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:142:in `transaction'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:201:in `new_or_current_transaction'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:116:in `queries'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session/adaptors.rb:108:in `query'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j/core/cypher_session.rb:36:in `block (2 levels) in <class:CypherSession>'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j-core/query.rb:240:in `response'
    from /usr/local/bundle/gems/neo4j-core-7.0.3/lib/neo4j-core/query.rb:311:in `pluck'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_enumerable.rb:88:in `pluck'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_enumerable.rb:26:in `result'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_enumerable.rb:13:in `each'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy_eager_loading.rb:6:in `each'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy.rb:193:in `to_a'
    from /usr/local/bundle/gems/neo4j-8.0.6/lib/neo4j/active_node/query/query_proxy.rb:193:in `[]'
    from (irb):1
    from /usr/local/bin/irb:11:in `<main>'

Note

Other classes are perfectly fine, as are other docker containers with the same app and different data.

Steps that I believe caused this error

Created a small JSON endpoint to work with creating nodes (get posted data, save node)
Iteratively modified both the endpoint and the request to do proactive bug testing, changing headers, post data, and post data structure.
Started saving the request headers to the nodes

I was only watching the return values for a while, and when I went back to list all the nodes, this error came up.

Seems like I was able to submit data in to Neo4j that could not be read again.

Runtime information:

Neo4j database version: neo4j gem version: 8.0.6 neo4j-core gem version: 7.0.3

Joshfindit commented 7 years ago

Here's some info that may uncover the error as well:

I went through each node individually by UUID, and all were able to be called, but some came back with truncated properties.

From Neo4j:

name    172.28.0.1@1486310400.8555803
normalized_name 172.28.0.1@1486310400.8555803
raw_input   {:body=>{"value"=>"Datapoint value 4 (Cleaned)"}, :header=>{"CONTENT_LENGTH"=>"42", "CONTENT_TYPE"=>"application/json", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"redacted", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"172.28.0.1", "REMOTE_HOST"=>"172.28.0.1", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"redacted", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"127.0.0.1", "SERVER_PORT"=>"4455", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.3.3/2016-11-21)", "HTTP_CACHE_CONTROL"=>"no-cache", "HTTP_POSTMAN_TOKEN"=>"redacted", "HTTP_AUTHORIZATION"=>"Token token=redacted", "HTTP_USER_AGENT"=>"PostmanRuntime/3.0.5", "HTTP_ACCEPT"=>"*/*", "HTTP_HOST"=>"127.0.0.1:4455", "HTTP_ACCEPT_ENCODING"=>"gzip, deflate", "HTTP_CONNECTION"=>"keep-alive", "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"redacted", "rack.request.query_string"=>"", "sinatra.route"=>"POST redacted"}}
value   Datapoint value 4 (Cleaned)
uuid    8e19341e-bfa8-4bc5-b827-bc169847e219

The result of Datapoint.find("8e19341e-bfa8-4bc5-b827-bc169847e219")

Datapoint
 MATCH (n:`Datapoint`)
 WHERE (n.uuid = {n_uuid})
 RETURN n
 ORDER BY n.uuid
 LIMIT {limit_1} | {:n_uuid=>"8e19341e-bfa8-4bc5-b827-bc169847e219", :limit_1=>1}
=> #<Datapoint uuid: "8e19341e-bfa8-4bc5-b827-bc169847e219", name: "172.28.0.1@1486310400.8555803", normalized_name: "172.28.0.1@1486310400.8555803", raw_input: "{:body=>{\"value\"=>\"Datapoint value 4 (Cleaned)\"}, :header=>{\"CONTENT_LENGTH\"=>\"42\", \"CONTENT_TYPE\"=>\"", value: "Datapoint value 4 (Cleaned)">

Joshfindit commented 7 years ago

Destroyed the nodes starting with the newest and working back until Datapoint.all worked.

It did not work until the following node was destroyed

name    172.28.0.1@1486310376.4773538
normalized_name 172.28.0.1@1486310376.4773538
raw_input   {:body=>{"value"=>"Datapoint value (Cleaned)"}, :header=>{"CONTENT_LENGTH"=>"40", "CONTENT_TYPE"=>"application/json", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"/redacted", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"172.28.0.1", "REMOTE_HOST"=>"172.28.0.1", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"redacted", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"127.0.0.1", "SERVER_PORT"=>"redacted", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.3.3/2016-11-21)", "HTTP_CACHE_CONTROL"=>"no-cache", "HTTP_POSTMAN_TOKEN"=>"redacted", "HTTP_AUTHORIZATION"=>"Token token=redacted", "HTTP_USER_AGENT"=>"PostmanRuntime/3.0.5", "HTTP_ACCEPT"=>"*/*", "HTTP_HOST"=>"redacted", "HTTP_ACCEPT_ENCODING"=>"gzip, deflate", "HTTP_CONNECTION"=>"keep-alive", "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"/redacted", "rack.request.query_string"=>"", "sinatra.route"=>"POST /redacted"}}
value   Datapoint value (Cleaned)
uuid    730ecebc-92e8-408e-91b6-e27c36499534

It now works (until the next time a node is saved, at which point it breaks again)

cheerfulstoic commented 7 years ago

Interesting... Looking at the property.rb file it seems like the attributes method returns a key which is nil (or at least where each_key has a nil in there somewhere) which would be very strange.

Is raw_input your property? Is it serialized?

Joshfindit commented 7 years ago

Here's the path:

raw_input is a String property on the model, just like usual.

classic sinatra app:

def get_parsed_requestbody(response)
parsed_response = JSON.parse(response)
rescue JSON::ParserError => e  
halt 400, JSON.generate({ status: 400, message: "Check your input." })
end

post '/test' do
...
  parsedJSON[:body] = get_parsed_requestbody(request.body.read)
  parsedJSON[:header] = request.env.select { |k, v| v.is_a? String }
...
  datapoint.raw_input = parsedJSON
  datapoint.save
...
end

Postman request using raw with JSON (see the headers above)
Fail pulling Datapoint.all using the get '/datapoints' do or irb

cheerfulstoic commented 7 years ago

It seems like parsedJSON because you are setting keys on it, so do you need to do parsedJSON.to_json when setting raw_input so that it's a string? Alternatively you could use serialization:

http://neo4jrb.readthedocs.io/en/8.0.x/ActiveNode.html#serialization

Joshfindit commented 7 years ago

Definitely I was doing something wrong in there; I'm not looking for a fix for my specific issue (though serialization looks like the winner), I just recognized that messing around was basically a Naughty Strings test that uncovered a possible bug:

I was able to submit data that broke the system without getting any errors while doing it.

cheerfulstoic commented 7 years ago

Fair point! ;) . Thanks for the naughty report

Joshfindit commented 7 years ago

Just did it again on a different model and with different input using plain strings. While importing everything was fine, then calling Model.all exposed the problem.

Joshfindit commented 7 years ago

In this case, it's down to just this:

Create a model with just a name property
Read lines from a text file
Create a node with the name as the line and save it (Model.new(name: line.strip))

At a certain point, a node is created that breaks .all

Joshfindit commented 7 years ago

And here's a way I can reproduce it just in irb with only a few lines:

require 'json'
require 'jsonapi-serializers'
require 'neo4j'
require 'neo4j/core/cypher_session/adaptors/http'
require 'neo4j/core/cypher_session/adaptors/bolt'
require 'securerandom' # UUID
require 'rexml/document'
require 'plist'
require 'digest'
require 'open3'
require 'tempfile'
require 'sinatra/base'

NEO4J_URL = "bolt://neo4j:password@localhost:7687"

class JunkNode
  include Neo4j::ActiveNode

  property :name, type: String
end

def get_session
  neo4j_adaptor = Neo4j::Core::CypherSession::Adaptors::Bolt.new(NEO4J_URL, {wrap_level: :proc})
  Neo4j::ActiveBase.on_establish_session { Neo4j::Core::CypherSession.new(neo4j_adaptor) }

  # Enable debugging Neo4j
  Neo4j::ActiveBase.current_session.adaptor.logger.level = Logger::DEBUG #DEBUG/ERROR/WARN
  Neo4j::Core::CypherSession::Adaptors::Base.subscribe_to_query(&method(:puts))
  Neo4j::Core::CypherSession::Adaptors::Bolt.subscribe_to_request(&method(:puts))
end

def import_filelist(filename)
    File.open(filename).each_with_index do |line, index|
      if thisJunkNode = JunkNode.find_by(name: line.strip)
      puts "Found node already"
    else
      puts "Does not exist. Creating."
      thisJunkNode = JunkNode.new(name: line.strip)
      thisJunkNode.save
    end
    JunkNode.all[0]
  end
end

get_session
Neo4j::ActiveBase.current_session.query("CREATE CONSTRAINT ON (n:JunkNode) ASSERT n.uuid IS UNIQUE") # Only needs to be run once, but no harm running it multiple times so not wrapping it in a check.

import_filelist("./filelist2.txt")

filelist2.txt is simply generated by:

doing find /path > ./filelist2.txt
Replacing the mount point with the drive's GPT UUID (For example: /Volumes/Storage/file.txt becomes 77AADCB1-ABF7-38D9-8741-27990301DE1D/file.txt)

Example file: filelist.txt.zip Note: The path chosen to generate the file doesn't seem to matter

Joshfindit commented 7 years ago

This is a show-stopper, and I keep hitting it. Essentially, I find myself having to create a bunch of code around all the data entry functions so that I can test for 'database corruption', and it's such a huge increase in workload that it's stopping me from making significant progress.

Am I doing something the wrong way? Am I uncovering a bug that no one else cares about? Is this a sign that there's a more interesting language/stack that other people are using?

subvertallchris commented 7 years ago

Could you make a repo project containing code and instructions to reproduce? I know you have some code above but this would help me and possibly others get to it sooner.

On Mon, Feb 27, 2017 at 8:33 AM Josh notifications@github.com wrote:

This is a show-stopper, and I keep hitting it. Essentially, I find myself having to create a bunch of code around all the data entry functions so that I can test for 'database corruption', and it's such a huge increase in workload that it's stopping me from making significant progress.

Am I doing something the wrong way? Am I uncovering a bug that no one else cares about? Is this a sign that there's a more interesting language/stack that other people are using?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/neo4jrb/neo4j/issues/1354#issuecomment-282720692, or mute the thread https://github.com/notifications/unsubscribe-auth/AD6E93xXZ_Y3PC27STvqxV2Mnuh2DlzRks5rgtC-gaJpZM4L3imF .

Joshfindit commented 7 years ago

@subvertallchris For sure, thanks. Here you go: https://github.com/Joshfindit/neo4jrb-issue-report-1354

subvertallchris commented 7 years ago

It's been running for 25 minutes and no errors. How far into the import does it usually crash?

subvertallchris commented 7 years ago

It finished and then died with this:

/Users/cgrigg/.rvm/gems/ruby-2.2.3/gems/neo4j-core-7.0.5/lib/neo4j/core/cypher_session/adaptors/bolt.rb:47:in `query_set': Making query, but expected there to be no buffer remaining! (RuntimeError)

https://github.com/neo4jrb/neo4j/issues/1311 references that and https://github.com/neo4jrb/neo4j-core/pull/284 might have fixed it. I don't think this is at all related to your issue, though. Maybe create a Gemfile in this repo locked to the specific versions of your dependencies?

Joshfindit commented 7 years ago

@subvertallchris Usually it crashes within a minute.

While working with the issue, I could justify it as being a hiccup in message handling. I did update to the latest version and test it, the issue doesn't happen here.

The repo has been updated with specific gem versions; do you mind testing to see if the issue happens in your setup with the older version?

neo4jrb / activegraph