vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

import vaex results in Segfault #522

Closed pedro-alonsod closed 4 years ago

pedro-alonsod commented 4 years ago

Hi,

I have just installed vaex via pip, and it crashes as soon as I try to import it.

import faulthandler; faulthandler.enable(); import vaex; print('import completed')

Results in this:

`Fatal Python error: Segmentation fault

Current thread 0x00007ff6cffab700 (most recent call first):
  File "", line 219 in _call_with_frames_removed
  File "", line 1043 in create_module
  File "", line 583 in module_from_spec
  File "", line 670 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/strings.py", line 7 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/column.py", line 9 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/utils.py", line 24 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 17 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/__init__.py", line 39 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "wordnetGraphHD.py", line 84 in
Segmentation fault (core dumped)`

Any clues?

JovanVeljanoski commented 4 years ago

Hi @pedro-alonsod

Can you provide some more information:

Thanks! Jovan.

pedro-alonsod commented 4 years ago

Hi Jovan,

Sure I can:

OS: Ubuntu Server 16.04

Conda: Can't install via this, I'm not a sudoer.

Imports:

`#%%
import os

# 1 imports & func

import nltk

nltk.download('all')

from nltk.corpus import wordnet as wn
from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()

import yawlib

from yawlib.glosswordnet import Gloss

from yawlib import YLConfig

from yawlib import WordNetSQL

ywn = WordNetSQL(YLConfig.WORDNET_30_PATH)

from nltk.corpus import stopwords #this to spacy stop words

get_ipython().run_line_magic('matplotlib', 'inline')

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

import faulthandler
import gc
import nltk
import re
import random
import string
from scipy.ndimage.interpolation import shift
stop_words = set(stopwords.words("english"))
resdef = wn.synset('ocean.n.01').definition()
print(resdef)
from string import punctuation
import os
import sklearn
from sklearn import datasets
import importlib
import numpy as np
import pprint as pp

from sklearn.preprocessing import MultiLabelBinarizer

nltk.download('reuters')

import keras
import pandas as pd
from scipy.spatial import distance
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import LabelEncoder
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.svm import SVC

import csv
import spacy
import seaborn as sns
from math import sqrt
from random import seed
from random import randrange

from neupy import algorithms, utils

########MySQL stuff
import mysql.connector
from mysql.connector import Error

# Copy to ditto the mutable vectors

import copy

# Manage large data ~20GB

faulthandler.enable()

import vaex

print('import completed')`

I will try 2.5.0, thanks. :D

JovanVeljanoski commented 4 years ago

Can you quickly try importing just vaex in your session, without anything else? That way we will know whether there is an incompatibility with some other package/library or whether the problem is with the installation.

pedro-alonsod commented 4 years ago

Yes, there is no crash with just:

import vaex
print('imported')

It imported successfully, with no complaints. I'm not sure what to make of this. Thanks. :D

pedro-alonsod commented 4 years ago

Reinstalling it did nothing, same SegFault:

`Fatal Python error: Segmentation fault

Current thread 0x00007f177bf56700 (most recent call first):
  File "", line 219 in _call_with_frames_removed
  File "", line 1043 in create_module
  File "", line 583 in module_from_spec
  File "", line 670 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/strings.py", line 7 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/column.py", line 9 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/utils.py", line 24 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 17 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "/nfs/staff/pedalo/.local/lib/python3.7/site-packages/vaex/__init__.py", line 39 in
  File "", line 219 in _call_with_frames_removed
  File "", line 728 in exec_module
  File "", line 677 in _load_unlocked
  File "", line 967 in _find_and_load_unlocked
  File "", line 983 in _find_and_load
  File "wordnetGraphHD.py", line 84 in
Segmentation fault (core dumped)`

JovanVeljanoski commented 4 years ago

Hi,

If vaex imports successfully in a session where you don't import any other packages, the problem could be conflicting or incompatible dependencies between vaex and some other package that you are trying to use. A similar known issue happens when importing vaex together with tensorflow-nightly (the stable tensorflow version works fine, see #514).

It might be useful to figure out, by trial and error, which package/dependency is causing this, by importing vaex together with the other packages in stages.
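The staged import can be automated. Below is a minimal sketch, not an official vaex tool: it imports one candidate package followed by vaex in a fresh interpreter per candidate, so a crash in one combination cannot take down the others. The helper names `run_imports` and `find_conflicts` are made up for illustration, and the candidate list in the usage comment is only a guess based on the imports posted above.

```python
import subprocess
import sys


def run_imports(modules, timeout=300):
    """Import `modules` in order in a fresh interpreter and return its exit code.

    On POSIX, a negative exit code -N means the child was killed by signal N,
    so a segmentation fault typically shows up as -11 (SIGSEGV).
    """
    code = "; ".join(f"import {name}" for name in modules)
    result = subprocess.run(
        # -X faulthandler dumps a Python traceback if the child crashes.
        [sys.executable, "-X", "faulthandler", "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return result.returncode


def find_conflicts(candidates, target="vaex"):
    """Return (candidate, exit_code) pairs for combinations that did not exit cleanly.

    An exit code of 1 usually means an ordinary Python exception, such as an
    ImportError for a package that is not installed, rather than a crash.
    """
    conflicts = []
    for name in candidates:
        # Import the candidate first, then the target, as in the failing script.
        exit_code = run_imports([name, target])
        if exit_code != 0:
            conflicts.append((name, exit_code))
    return conflicts


# Example usage (hypothetical candidate list taken from the script above):
# print(find_conflicts(["nltk", "keras", "spacy", "sklearn", "seaborn", "neupy"]))
```

If no single package triggers the crash, the same helper can be called with longer lists, for example `run_imports(["keras", "spacy", "vaex"])`, to test combinations.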

JovanVeljanoski commented 4 years ago

(Closed because stale - please re-open if needed)