Open Joshfindit opened 7 years ago
Here's some info that may uncover the error as well:
I went through each node individually by UUID, and all were able to be called, but some came back with truncated properties.
From Neo4j:
name 172.28.0.1@1486310400.8555803
normalized_name 172.28.0.1@1486310400.8555803
raw_input {:body=>{"value"=>"Datapoint value 4 (Cleaned)"}, :header=>{"CONTENT_LENGTH"=>"42", "CONTENT_TYPE"=>"application/json", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"redacted", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"172.28.0.1", "REMOTE_HOST"=>"172.28.0.1", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"redacted", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"127.0.0.1", "SERVER_PORT"=>"4455", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.3.3/2016-11-21)", "HTTP_CACHE_CONTROL"=>"no-cache", "HTTP_POSTMAN_TOKEN"=>"redacted", "HTTP_AUTHORIZATION"=>"Token token=redacted", "HTTP_USER_AGENT"=>"PostmanRuntime/3.0.5", "HTTP_ACCEPT"=>"*/*", "HTTP_HOST"=>"127.0.0.1:4455", "HTTP_ACCEPT_ENCODING"=>"gzip, deflate", "HTTP_CONNECTION"=>"keep-alive", "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"redacted", "rack.request.query_string"=>"", "sinatra.route"=>"POST redacted"}}
value Datapoint value 4 (Cleaned)
uuid 8e19341e-bfa8-4bc5-b827-bc169847e219
The result of Datapoint.find("8e19341e-bfa8-4bc5-b827-bc169847e219")
Datapoint
MATCH (n:`Datapoint`)
WHERE (n.uuid = {n_uuid})
RETURN n
ORDER BY n.uuid
LIMIT {limit_1} | {:n_uuid=>"8e19341e-bfa8-4bc5-b827-bc169847e219", :limit_1=>1}
=> #<Datapoint uuid: "8e19341e-bfa8-4bc5-b827-bc169847e219", name: "172.28.0.1@1486310400.8555803", normalized_name: "172.28.0.1@1486310400.8555803", raw_input: "{:body=>{\"value\"=>\"Datapoint value 4 (Cleaned)\"}, :header=>{\"CONTENT_LENGTH\"=>\"42\", \"CONTENT_TYPE\"=>\"", value: "Datapoint value 4 (Cleaned)">
Destroyed the nodes starting with the newest and working back until Datapoint.all worked.
It did not work until the following node was destroyed
name 172.28.0.1@1486310376.4773538
normalized_name 172.28.0.1@1486310376.4773538
raw_input {:body=>{"value"=>"Datapoint value (Cleaned)"}, :header=>{"CONTENT_LENGTH"=>"40", "CONTENT_TYPE"=>"application/json", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"/redacted", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"172.28.0.1", "REMOTE_HOST"=>"172.28.0.1", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"redacted", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"127.0.0.1", "SERVER_PORT"=>"redacted", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.3.3/2016-11-21)", "HTTP_CACHE_CONTROL"=>"no-cache", "HTTP_POSTMAN_TOKEN"=>"redacted", "HTTP_AUTHORIZATION"=>"Token token=redacted", "HTTP_USER_AGENT"=>"PostmanRuntime/3.0.5", "HTTP_ACCEPT"=>"*/*", "HTTP_HOST"=>"redacted", "HTTP_ACCEPT_ENCODING"=>"gzip, deflate", "HTTP_CONNECTION"=>"keep-alive", "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"/redacted", "rack.request.query_string"=>"", "sinatra.route"=>"POST /redacted"}}
value Datapoint value (Cleaned)
uuid 730ecebc-92e8-408e-91b6-e27c36499534
It now works (until the next time a node is saved, at which point it breaks again)
Interesting... Looking at the property.rb
file it seems like the attributes
method returns a key which is nil
(or at least where each_key
has a nil
in there somewhere) which would be very strange.
Is raw_input
your property? Is it serialized?
Here's the path:
raw_input
is a String
property on the model, just like usual.classic sinatra app:
def get_parsed_requestbody(response)
parsed_response = JSON.parse(response)
rescue JSON::ParserError => e
halt 400, JSON.generate({ status: 400, message: "Check your input." })
end
post '/test' do
...
parsedJSON[:body] = get_parsed_requestbody(request.body.read)
parsedJSON[:header] = request.env.select { |k, v| v.is_a? String }
...
datapoint.raw_input = parsedJSON
datapoint.save
...
end
Datapoint.all
using the get '/datapoints' do
or irb
It seems like parsedJSON
because you are setting keys on it, so do you need to do parsedJSON.to_json
when setting raw_input
so that it's a string? Alternatively you could use serialization:
http://neo4jrb.readthedocs.io/en/8.0.x/ActiveNode.html#serialization
Definitely I was doing something wrong in there; I'm not looking for a fix for my specific issue (though serialization looks like the winner), I just recognized that messing around was basically a Naughty Strings test that uncovered a possible bug:
I was able to submit data that broke the system without getting any errors while doing it.
Fair point! ;) . Thanks for the naughty report
Just did it again on a different model and with different input using plain strings.
While importing everything was fine, then calling Model.all
exposed the problem.
In this case, it's down to just this:
name
propertyModel.new(name: line.strip)
)At a certain point, a node is created that breaks .all
And here's a way I can reproduce it just in irb
with only a few lines:
require 'json'
require 'jsonapi-serializers'
require 'neo4j'
require 'neo4j/core/cypher_session/adaptors/http'
require 'neo4j/core/cypher_session/adaptors/bolt'
require 'securerandom' # UUID
require 'rexml/document'
require 'plist'
require 'digest'
require 'open3'
require 'tempfile'
require 'sinatra/base'
NEO4J_URL = "bolt://neo4j:password@localhost:7687"
class JunkNode
include Neo4j::ActiveNode
property :name, type: String
end
def get_session
neo4j_adaptor = Neo4j::Core::CypherSession::Adaptors::Bolt.new(NEO4J_URL, {wrap_level: :proc})
Neo4j::ActiveBase.on_establish_session { Neo4j::Core::CypherSession.new(neo4j_adaptor) }
# Enable debugging Neo4j
Neo4j::ActiveBase.current_session.adaptor.logger.level = Logger::DEBUG #DEBUG/ERROR/WARN
Neo4j::Core::CypherSession::Adaptors::Base.subscribe_to_query(&method(:puts))
Neo4j::Core::CypherSession::Adaptors::Bolt.subscribe_to_request(&method(:puts))
end
def import_filelist(filename)
File.open(filename).each_with_index do |line, index|
if thisJunkNode = JunkNode.find_by(name: line.strip)
puts "Found node already"
else
puts "Does not exist. Creating."
thisJunkNode = JunkNode.new(name: line.strip)
thisJunkNode.save
end
JunkNode.all[0]
end
end
get_session
Neo4j::ActiveBase.current_session.query("CREATE CONSTRAINT ON (n:JunkNode) ASSERT n.uuid IS UNIQUE") # Only needs to be run once, but no harm running it multiple times so not wrapping it in a check.
import_filelist("./filelist2.txt")
filelist2.txt is simply generated by:
find /path > ./filelist2.txt
/Volumes/Storage/file.txt
becomes 77AADCB1-ABF7-38D9-8741-27990301DE1D/file.txt
)Example file: filelist.txt.zip Note: The path chosen to generate the file doesn't seem to matter
This is a show-stopper, and I keep hitting it. Essentially, I find myself having to create a bunch of code around all the data entry functions so that I can test for 'database corruption', and it's such a huge increase in workload that it's stopping me from making significant progress.
Am I doing something the wrong way? Am I uncovering a bug that no one else cares about? Is this a sign that there's a more interesting language/stack that other people are using?
Could you make a repo project containing code and instructions to reproduce? I know you have some code above but this would help me and possibly others get to it sooner.
On Mon, Feb 27, 2017 at 8:33 AM Josh notifications@github.com wrote:
This is a show-stopper, and I keep hitting it. Essentially, I find myself having to create a bunch of code around all the data entry functions so that I can test for 'database corruption', and it's such a huge increase in workload that it's stopping me from making significant progress.
Am I doing something the wrong way? Am I uncovering a bug that no one else cares about? Is this a sign that there's a more interesting language/stack that other people are using?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/neo4jrb/neo4j/issues/1354#issuecomment-282720692, or mute the thread https://github.com/notifications/unsubscribe-auth/AD6E93xXZ_Y3PC27STvqxV2Mnuh2DlzRks5rgtC-gaJpZM4L3imF .
@subvertallchris For sure, thanks. Here you go: https://github.com/Joshfindit/neo4jrb-issue-report-1354
It's been running for 25 minutes and no errors. How far into the import does it usually crash?
It finished and then died with this:
/Users/cgrigg/.rvm/gems/ruby-2.2.3/gems/neo4j-core-7.0.5/lib/neo4j/core/cypher_session/adaptors/bolt.rb:47:in `query_set': Making query, but expected there to be no buffer remaining! (RuntimeError)
https://github.com/neo4jrb/neo4j/issues/1311 references that and https://github.com/neo4jrb/neo4j-core/pull/284 might have fixed it. I don't think this is at all related to your issue, though. Maybe create a Gemfile in this repo locked to the specific versions of your dependencies?
@subvertallchris Usually it crashes within a minute.
While working with the issue, I could justify it as being a hiccup in message handling. I did update to the latest version and test it, the issue doesn't happen here.
The repo has been updated with specific gem versions; do you mind testing to see if the issue happens in your setup with the older version?
Apologies for the poor quality of this issue submission: This came up on a dev branch with a bunch of changes. If this gives you any pointers for info that I can submit to help debug, let me know.
Command that causes the error
Class.all
Expected result
An array containing all nodes from that class
Actual result
Note
Other classes are perfectly fine, as are other docker containers with the same app and different data.
Steps that I believe caused this error
I was only watching the return values for a while, and when I went back to list all the nodes, this error came up.
Seems like I was able to submit data in to Neo4j that could not be read again.
Runtime information:
Neo4j database version:
neo4j
gem version: 8.0.6neo4j-core
gem version: 7.0.3