ssbc / ssb-friends

Manages the SSB social graph
MIT License
23 stars 6 forks source link

unhanging tests #30

Closed mixmix closed 5 years ago

mixmix commented 5 years ago

this fixes #29 for me

I'll be honest, I don't really understand how and when to use setImmediate ... but this works ? If someone has an idea of why sbot not quite having time to close cleanly was causing tape to get hung up I'd love to understand more.

dominictarr commented 5 years ago

it doesn't make sense that this works but the other way didn't.

Does it print the test output and just not exit? this sounds like a mystery for why-is-node-running

mixmix commented 5 years ago

that's a sweet tool. Here's what I see (try branch why-is-node to reproduce)

There are 7 handle(s) keeping the process running

# Timeout
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/db.js:50        - var timer = setInterval(update, 200)
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/index.js:25     - var db = require('./db')(join(opts.path || path, 'flume'), keys, opts)
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/create.js:7     - return SSB(
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-server/index.js:51 - var ssb = create(path.join(opts.path, 'db'), opts, opts.keys)

# TCPSERVERWRAP
(unknown stack trace)

# Timeout
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/db.js:50        - var timer = setInterval(update, 200)
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/index.js:25     - var db = require('./db')(join(opts.path || path, 'flume'), keys, opts)
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/create.js:7     - return SSB(
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-server/index.js:51 - var ssb = create(path.join(opts.path, 'db'), opts, opts.keys)

# TCPSERVERWRAP
(unknown stack trace)

# Timeout
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/db.js:50        - var timer = setInterval(update, 200)
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/index.js:25     - var db = require('./db')(join(opts.path || path, 'flume'), keys, opts)
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-db/create.js:7     - return SSB(
/home/mix/projects/SSBC/ssb-friends/node_modules/ssb-server/index.js:51 - var ssb = create(path.join(opts.path, 'db'), opts, opts.keys)

# TCPSERVERWRAP
(unknown stack trace)

# Timeout
/home/mix/projects/SSBC/ssb-friends/test/friends.js:283              - setTimeout(log, 5e3)
/home/mix/projects/SSBC/ssb-friends/node_modules/tape/lib/test.js:64 - return val.apply(self, arguments);
/home/mix/projects/SSBC/ssb-friends/node_modules/tape/lib/test.js:83 - this._cb(this);
/home/mix/projects/SSBC/ssb-friends/node_modules/tape/lib/test.js:64 - return val.apply(self, arguments);
dominictarr commented 5 years ago

now we just have to figure out what it means. @arj03 do you have any clues?

I'm pretty sure that it's the servers and not setInterval (since they all have timer.unref && timer.unref())

@mixmix does it still happen when you run just one test at a time? I notice you've added this to all the tests - I can imagine that it somehow happens in test/friends which creates multilpe sbots, but test/sbot.js only creates a single sbot. If that one is hanging something really weird is happening. If it is only test/friends.js try disabling the tests within it so to see if it still hangs when you just run one, or otherwise figure out which one is hanging by a process of elimination

mixmix commented 5 years ago

yeah it still happens with just one tape test on one file being run.

I think the problem is the way we're doing startup and shutdown of the server. The "cleanup" test is a hack which is not working. If you add setTimeout(ssbServer.close, 5e3) then everything closes down completely happily. (ALSO, the node logger still mentions the setInterval is running in this happy case, so it's not the problem)

mixmix commented 5 years ago

It's tempting to make a verbose sbot harness which just works ... but we need a generalised solution so heaps of people don't fall in this hole. I notice with this sort of test I use scuttle-testbot and I think I've used tape-plus for it's beforeEach and afterEach hooks. That's worked a charm for me

dominictarr commented 5 years ago

@mixmix what node version are you using? what os? debian?

If you just shift close to be after t.end() does that work?

t('cleanup', function (t) {
  t.end()
  sbot.close()
})

It should work as it is... it does work for me and on travis too It is conceivable that not enough other people are running the tests (I know I run them but I don't really know)

Does anyone else have a problem? @christianbundy

dominictarr commented 5 years ago

also have you had this problem with any other ssb-* tests? which others have you been running?

mixmix commented 5 years ago

node 8.11.3 kubuntu (so yes debian base)

t('cleanup', function (t) {
  t.end()
  sbot.close()
})

does not work


I know this is annoying. I want to clarify that I'm trying to get tests running so I can write docs and add tests if it's helpful. Till I can run tests I don't want to touch anything (of the code).

dominictarr commented 5 years ago

Yes I understand that - but the way I see it you have already found a problem. As a policy we shouldn't make changes that we don't understand. If we made changes like that we'd have a lot more problems.

you didn't answer my other question: which other tests have you been running?

does the simplest possible test fail?

var tape = require('tape')
var createSbot = require('ssb-server')
//no plugins!
//  .use(require('ssb-replicate'))
//  .use(require('..'))

var sbot = createSbot({
  temp: 'alice',
  port: 45451,
  host: 'localhost',
})

tape('close immediately', function (t) {
  sbot.close()
  t.end()
})

ssb-friends doesn't actually touch close... this is probably really a problem in ssb-db or secret-stack or flumedb.

dominictarr commented 5 years ago

btw, I don't think the problem is tape. removing t.end() makes tape error for me... I mean, tape doesn't block node from exiting.

what versions do you have?

$ npm ls ssb-db ssb-server flumedb
ssb-friends@3.1.12 /home/dominic/c/ssb-friends
`-- ssb-server@13.6.1 
  +-- ssb-db@18.6.5 
  | `-- flumedb@1.0.4 
  `-- ssb-ooo@1.1.0
    `-- flumedb@1.0.4  deduped
dominictarr commented 5 years ago

hmm, unless it's something completely silly (which it could be, given that it only fails for you) is that there is something that on your system hasn't finished opening when close is called. It's weird that it would work within setImmediate though... because that is quite a narrow band of time.

arj03 commented 5 years ago

Ill have a look this evening

christianbundy commented 5 years ago

I'm not having any problems with npm test on master, but it's possible that our dependencies are wonky. I've pushed a branch called with-lockfile that has a full dependency tree that works for me. You should be able to install the exact deps with npm ci and run npm test.

This should either resolve the problem or eliminate dependencies mismatch as a possible issue. Hope this helps!

arj03 commented 5 years ago

I remember I went through all the code of ssb-server and modules used a few months back looking for these timer unref as they can keep the server running and fixed one. After that I didn't have a problem with ssb-server hanging anymore.

I'm not able to reproduce the hang here, node 10.11.

mixmix commented 5 years ago

https://github.com/ssbc/ssb-friends/pull/30#issuecomment-460033313 :

yeah it still happens with just one tape test on one file being run.

i.e. yes I did test it with only one test in one file


does the simplest possible test fail?

That simple test passed. I kept adding complexity until I got it failing. It's when there are 2 "real" tests, and the "cleanup" test that I start seeing the hang. Just one "real" test is fine....

var tape = require('tape')
var Keys = require('ssb-keys')
var createSbot = require('ssb-server')
  .use(require('ssb-replicate'))
  .use(require('..'))

var sbot = createSbot({
  temp: 'alice',
  port: 45451,
  host: 'localhost'
})

tape('group', t => {
  t.test('run another test', function (st) {
    const follow = {
      type: 'contact',
      contact: Keys.generate().id,
      following: true
    }
    sbot.publish(follow, (err, data) => {
      st.false(err)
      st.ok(data, JSON.stringify(data.value.content))
      st.equal(2 + 2, 4)
      st.end()
    })
  })

  // adding this second t.test stimulates the hang for me
  t.test('run another TEST', function (st) {
    const follow = {
      type: 'contact',
      contact: Keys.generate().id,
      following: true
    }
    sbot.publish(follow, (err, data) => {
      st.false(err)
      st.ok(data, JSON.stringify(data.value.content))
      st.equal(2 + 2, 4)
      st.end()
    })
  })

  t.test('close immediately', function (st) {
    sbot.close()
    st.end()
  })

  t.end()
})

Versions:

mixmix SSBC/ssb-friends »  npm ls ssb-db ssb-server flumedb                         
ssb-friends@3.1.12 /home/mix/projects/SSBC/ssb-friends
└─┬ ssb-server@13.6.1 
  ├─┬ ssb-db@18.6.5 
  │ └── flumedb@1.0.4 
  └─┬ ssb-ooo@1.1.0
    └── flumedb@1.0.4  deduped
mixmix commented 5 years ago

I've pushed a branch called with-lockfile

THIS WORKS @christianbundy !!

Can I request other people nuke their node_modules and install fresh on the master branch to see if they can get this failing state :bomb:

mixmix commented 5 years ago

OH GOD. I just nuked and installed the master branch and everything is passing for me now.

I fee like I'm losing my mind

It seems like the npm gods just felt like throwing poop at me yesterday? I think that means this is a non-issue. I'm really sorry to waste everyones time if that's the case

dominictarr commented 5 years ago

npm is haunted, it's not your fault but you do need to chase the spirits away by rm -rf node_modules

btw, the first thing npm ci does is rm -rf node_modules so that happened when you installed from the lockfile.

stale[bot] commented 5 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?