networkgeometry / mercator

Inference of high-quality embeddings of complex networks into the hyperbolic disk.
GNU General Public License v3.0

Reduce to LCC bug under Archlinux #3

Open jmon12 opened 2 years ago

jmon12 commented 2 years ago

# The problem

Running mercator as `mercator edges.edges` on a file `edges.edges` that follows the edgefile format convention, and whose LCC is not the whole network, produces one of the two behaviors below. The pattern leading to one or the other hasn't been identified:

1. Almost the expected behavior: the expected message is printed, but the program then aborts with a core dump:

       More than one component found (996/998 vertices in the largest component.
       Edges belonging to the largest component saved to edges_GC.edge. Please rerun the program using this new edgelist.
       terminate called without an active exception
       Aborted (core dumped)

2. **The dangerous behavior**: only the first part of the expected message is printed: `More than one component found (996/998 vertices in the largest component.`. Then **_the program doesn't terminate_** and fills the file `edges_GC.edge` at a rate of ~1 GB/s with what seems to be a single line of whitespace. It has to be killed, and it can fill up the disk within a few seconds.

# Versions and software

Note that the current master branch of mercator has been used, commit 14d4aaccc3cfbe30e1ecb84bb16af735105759d0. I could reproduce the issue with:
- Archlinux with kernel 5.10.16-arch1-1 and gcc 10.2.0
- Archlinux with kernel 5.15.4-arch1-1 and gcc 11.1.0

# Investigation

- Some attempts under OSX didn't reproduce the problem (with which compiler, @antoineallard?)
- Not yet able to estimate the probability of hitting the dangerous behavior

guoguo616 commented 8 months ago

For anyone coming here to solve the problem, just remove the last line break from the last line.

antoineallard commented 8 months ago

Thank you @jmon12 for reporting this bug and @guoguo616 for the solution. I apologize for the lack of reply; I guess I missed the original message...

@guoguo616 Would you mind submitting a pull request with the correction? Or specify the number of the line we need to remove? Thank you so much!

guoguo616 commented 8 months ago

Thank you for the reply, @antoineallard. I'm sorry, I don't know how to fix the error in C++ because I only have very limited experience with it. I suspect the problem is around embeddingS1.hpp, lines 1860~1862: we should skip both empty (or whitespace-only) lines and lines starting with "#" there.
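To make the suggestion concrete, here is a minimal sketch of a line-by-line reader that skips empty lines, whitespace-only lines and lines starting with "#". It is not the code in embeddingS1.hpp: the function name `read_edges` and the "two whitespace-separated names per line" layout are assumptions made for illustration only.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of mercator): reads "name1 name2" pairs from a
// stream, ignoring empty lines, whitespace-only lines and lines starting with '#'.
std::vector<std::pair<std::string, std::string>> read_edges(std::istream& in)
{
  std::vector<std::pair<std::string, std::string>> edges;
  std::string line;
  // Looping on getline() itself (rather than on eof()) stops cleanly after the
  // last line, whether or not the input ends with a line break.
  while (std::getline(in, line))
  {
    // Position of the first non-whitespace character ('\r' covers "\r\n" endings).
    const std::size_t first = line.find_first_not_of(" \t\r\n\v\f");
    if (first == std::string::npos) continue;  // empty or whitespace-only line
    if (line[first] == '#') continue;          // comment line
    std::istringstream fields(line);
    std::string name1, name2;
    if (fields >> name1 >> name2)              // ignore lines with fewer than two fields
      edges.emplace_back(name1, name2);
  }
  return edges;
}
```

With this kind of loop, an input such as `"# comment\n1 2\n\n2 3\n\n"` yields only the two edges, regardless of trailing blank lines.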

antoineallard commented 8 months ago

Ah ok, I see. You meant removing the last line break of the last line of the .edges file?

guoguo616 commented 8 months ago

Yes, "removing the last line break of the last line of the .edges file" is how I work around the problem. I highly suspect that any blank line will cause it, but I didn't manage to test that, and I am not sure how the readline stream operator actually behaves on different OSes. It would be so nice of you to test and handle both the blank line case and the trailing line break case. Thank you again.
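Until the parser itself is fixed, the workaround can be automated on the input side. The sketch below is a hypothetical helper, `strip_blank_lines` (a name chosen here, not something mercator provides), that takes the text of an edge list and returns it without blank lines and without a line break after the last line. Whether blank lines in the middle of the file trigger the bug is untested, so removing them is only a precaution.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of mercator): returns the edge list text with
// blank lines dropped and without a line break after the last line.
std::string strip_blank_lines(const std::string& text)
{
  std::istringstream in(text);
  std::string line, cleaned;
  while (std::getline(in, line))
  {
    // Drop a trailing '\r' left over from Windows-style line endings.
    if (!line.empty() && line.back() == '\r') line.pop_back();
    // Skip lines that are empty or contain only spaces and tabs.
    if (line.find_first_not_of(" \t") == std::string::npos) continue;
    if (!cleaned.empty()) cleaned += '\n';  // separator between lines, never after the last one
    cleaned += line;
  }
  return cleaned;
}
```

The cleaned text would then be written to a new file and that file passed to mercator instead of the original.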