lintool closed 1 month ago
That's weird. I just reran this test on orca and it passed.
Done!
5191286it [00:07, 688446.36it/s]
100%|███████████████████████████████████████████████████████████████████| 5193/5193 [00:01<00:00, 4988.37it/s]
.
----------------------------------------------------------------------
Ran 1 test in 4216.341s
OK
Seems to be a macOS problem... I'm trying to debug.
@stephaniewhoo do you have access to a macOS machine you can try also?
Seems to be related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92061
This seems to be a known issue and doesn't appear to have been resolved yet: https://github.com/microsoft/LightGBM/issues/4229
It also works for me on orca. I will try on my macOS machine too.
Yes, confirmed that the test case passes on orca
.
score_tie occurs 208854 times in 5188 queries
recall@10:0.0
recall@20:0.0
recall@50:0.0
recall@100:0.0
recall@200:0.0
recall@250:0.0
recall@300:0.0
recall@333:0.0
recall@400:0.0
recall@500:0.0
recall@1000:0.0
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5193/5193 [00:16<00:00, 312.44it/s]
score_tie occurs 208854 times in 5188 queries
Done!
5191286it [00:07, 669016.70it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5193/5193 [00:01<00:00, 4888.41it/s]
.
----------------------------------------------------------------------
Ran 1 test in 10265.931s
OK
The test finally completed on my laptop (macOS) and it passes as well. My system is also macOS Monterey 12.1 (21C52).
Same as Stephanie, the test also passed on my macOS machine. My system is macOS Big Sur version 11.5.2.
Trying out the example here: https://github.com/microsoft/LightGBM/issues/4229
% python
Python 3.8.12 (default, Oct 12 2021, 06:23:56)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lightgbm import LGBMClassifier
>>> import numpy as np
>>> from concurrent.futures import ThreadPoolExecutor
>>>
>>> x = np.random.random((200, 4))
>>> y = x.sum(axis=1) >= 2
>>>
>>>
>>> def myfunc(a=7):
...     test = LGBMClassifier().fit(x, y)
...     print(test.predict(x))
...
>>>
>>> with ThreadPoolExecutor(20) as tpe:
...     print(list(tpe.map(myfunc, range(20))))
...
zsh: segmentation fault python
Indeed, I get a seg fault - this is on macOS 12.1.
Additional details:
% pip list | grep lightgbm
lightgbm 3.3.2
% brew info libomp
libomp: stable 13.0.0 (bottled)
LLVM's OpenMP runtime library
https://openmp.llvm.org/
/usr/local/Cellar/libomp/13.0.0 (9 files, 1.6MB) *
Poured from bottle on 2022-01-12 at 21:15:03
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/libomp.rb
License: MIT
==> Dependencies
Build: cmake ✘
==> Analytics
install: 52,037 (30 days), 228,792 (90 days), 1,152,680 (365 days)
install-on-request: 7,733 (30 days), 31,763 (90 days), 140,940 (365 days)
build-error: 10 (30 days)
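Because the segfault above kills the interpreter outright, it can help to run the reproduction in a child process and inspect how it exited. The following is a minimal sketch, not part of the original thread: `REPRO` restates the snippet from the LightGBM issue (without the `print` calls), and `run_isolated` is a hypothetical helper name. It assumes a POSIX system, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys

# The reproduction from the thread, kept as a string so it runs in a child process.
REPRO = """
from concurrent.futures import ThreadPoolExecutor
import numpy as np
from lightgbm import LGBMClassifier

x = np.random.random((200, 4))
y = x.sum(axis=1) >= 2

def myfunc(a=7):
    return LGBMClassifier().fit(x, y).predict(x)

with ThreadPoolExecutor(20) as tpe:
    list(tpe.map(myfunc, range(20)))
"""


def run_isolated(code, timeout=600):
    """Run `code` in a fresh interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exit code %d" % proc.returncode
```

Calling `print(run_isolated(REPRO))` on each machine gives a one-line result that is easy to paste into the issue, and the parent process survives even if the child crashes.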
I tested the above script on my Mac with macOS 11.5.2, libomp 13.0.0, and lightgbm 3.2.2. No segmentation fault occurred; everything works fine. That narrows the cause of the test failure down to the macOS version.
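Since the comparison above hinges on exact versions, a small report that each person can run makes the differences easier to spot. This is a sketch using only the standard library; `environment_report` and `package_version` are hypothetical helper names. It does not cover libomp, which is installed through Homebrew rather than pip, so `brew info libomp` is still needed for that.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a pip/conda package, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("lightgbm", "numpy")):
    """Collect the OS, CPU architecture, Python, and package versions."""
    report = {
        "system": platform.system(),
        "macos": platform.mac_ver()[0],  # empty string when not on macOS
        "machine": platform.machine(),   # CPU architecture as seen by this interpreter
        "python": platform.python_version(),
    }
    for name in packages:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```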
Waiting for upstream fixes. No further action for now.
I just updated to libomp 14.0.0 via brew. The issue still persists.
Update, still having this issue:
% python -m unittest integrations.sparse.test_lucenesearcher_check_ltr_msmarco_document.TestLtrMsmarcoDocument
...
recall@10:0.0
recall@20:0.0
recall@50:0.0
recall@100:0.0
recall@200:0.0
recall@500:0.0
recall@1000:0.0
recall@2000:0.0
recall@5000:0.0
recall@10000:0.0
Attempting to initialize pre-built index msmarco-doc-per-passage-ltr.
/Users/jimmylin/.cache/pyserini/indexes/index-msmarco-doc-per-passage-ltr-20211031-33e4151.bd60e89041b4ebbabc4bf0cfac608a87 already exists, skipping download.
Initializing msmarco-doc-per-passage-ltr...
Attempting to initialize pre-built index msmarco-doc-per-passage-ltr.
/Users/jimmylin/.cache/pyserini/indexes/index-msmarco-doc-per-passage-ltr-20211031-33e4151.bd60e89041b4ebbabc4bf0cfac608a87 already exists, skipping download.
Initializing msmarco-doc-per-passage-ltr...
analyzed contents
text_unlemm text_unlemm
text_bert_tok text_bert_tok
IBM model Load takes 14.59 seconds
IBM model Load takes 32.02 seconds
IBM model Load takes 315.10 seconds
IBM model Load takes 58.74 seconds
#
[thread 53763 also had an error]
[thread 78855 also had an error]
[thread 54019 also had an error]
[thread 76035 also had an error]
[thread 51207 also had an error]
[thread 75523 also had an error]
[thread 77827 also had an error]
[thread 51463 also had an error][thread 77059 also had an error]
[thread 52995 also had an error]# A fatal error has been detected by the Java Runtime Environment:
[thread 52739 also had an error][thread 77571 also had an error]
[thread 51715 also had an error]
[thread 77315 also had an error]
[thread 50951 also had an error][thread 78087 also had an error][thread 78599 also had an error]
[thread 50695 also had an error]
[thread 51971 also had an error][thread 50439 also had an error]
[thread 52483 also had an error]
[thread 53251 also had an error]
[thread 76291 also had an error][thread 76803 also had an error][thread 53507 also had an error]
[thread 76547 also had an error][thread 52227 also had an error]
#
# SIGSEGV (0xb)[thread 75267 also had an error] at pc=0x0000000104abbffe
, pid=24960, tid=78343
#
# JRE version: Java(TM) SE Runtime Environment (11.0.4+10) (build 11.0.4+10-LTS)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (11.0.4+10-LTS, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# [thread 71171 also had an error]
[thread 70659 also had an error]
[thread 70147 also had an error]
[thread 57347 also had an error]
[thread 69891 also had an error]
[thread 69635 also had an error]
C [libomp.dylib+0x60ffe] __kmp_suspend_initialize_thread+0x1e
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/jimmylin/workspace/pyserini/hs_err_pid24960.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
/Users/jimmylin/opt/anaconda3/envs/pyserini-dev/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Traceback (most recent call last):
File "tools/scripts/msmarco/msmarco_doc_eval.py", line 235, in <module>
main(parser.parse_args())
File "tools/scripts/msmarco/msmarco_doc_eval.py", line 222, in main
metrics = compute_metrics_from_files(path_to_reference, path_to_candidate, exclude_qids)
File "tools/scripts/msmarco/msmarco_doc_eval.py", line 184, in compute_metrics_from_files
qids_to_ranked_candidate_documents = load_candidate(path_to_candidate)
File "tools/scripts/msmarco/msmarco_doc_eval.py", line 98, in load_candidate
with autoopen(path_to_candidate,'r') as f:
File "tools/scripts/msmarco/msmarco_doc_eval.py", line 28, in autoopen
return open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'ltr_test/run.ltr.msmarco-pass-doc.test.trec'
E
======================================================================
ERROR: test_reranking (integrations.sparse.test_lucenesearcher_check_ltr_msmarco_document.TestLtrMsmarcoDocument)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jimmylin/workspace/pyserini/integrations/sparse/test_lucenesearcher_check_ltr_msmarco_document.py", line 50, in test_reranking
result = subprocess.check_output(f'python tools/scripts/msmarco/msmarco_doc_eval.py --judgments tools/topics-and-qrels/qrels.msmarco-doc.dev.txt --run ltr_test/{outp}', shell=True).decode(sys.stdout.encoding)
File "/Users/jimmylin/opt/anaconda3/envs/pyserini-dev/lib/python3.8/subprocess.py", line 415, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/Users/jimmylin/opt/anaconda3/envs/pyserini-dev/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python tools/scripts/msmarco/msmarco_doc_eval.py --judgments tools/topics-and-qrels/qrels.msmarco-doc.dev.txt --run ltr_test/run.ltr.msmarco-pass-doc.test.trec' returned non-zero exit status 1.
----------------------------------------------------------------------
Ran 1 test in 4061.513s
FAILED (errors=1)
And:
% brew info libomp
libomp: stable 14.0.0 (bottled)
LLVM's OpenMP runtime library
Even with:
export LIBOMP_USE_HIDDEN_HELPER_TASK=0
export LIBOMP_NUM_HIDDEN_HELPER_THREADS=0
Per https://github.com/microsoft/LightGBM/issues/4229#issuecomment-930614380 - didn't help.
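To rule out the possibility that the variables never reached the process that loads libomp, one option is to pass them explicitly when launching the test. The sketch below is an assumption-laden illustration: the variable names are copied from the exports above, `run_with_overrides` is a hypothetical helper, and whether libomp honors these settings is not verified here.

```python
import os
import subprocess
import sys

# Names copied from the thread; their effect on libomp is not verified here.
LIBOMP_OVERRIDES = {
    "LIBOMP_USE_HIDDEN_HELPER_TASK": "0",
    "LIBOMP_NUM_HIDDEN_HELPER_THREADS": "0",
}


def run_with_overrides(argv, overrides=LIBOMP_OVERRIDES):
    """Run a command with the overrides merged into a copy of the current environment."""
    env = dict(os.environ)
    env.update(overrides)
    return subprocess.run(argv, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Confirm that a child interpreter actually sees the variables.
    check = "import os; print(os.environ.get('LIBOMP_USE_HIDDEN_HELPER_TASK'))"
    print(run_with_overrides([sys.executable, "-c", check]).stdout.strip())
```

The same helper could launch the unittest command from this thread, so the settings are guaranteed to be in place before any native library is loaded.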
Trying this again:
% brew info libomp
==> libomp: stable 14.0.6 (bottled)
LLVM's OpenMP runtime library
Still getting same error.
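In the failing run above, the final `FileNotFoundError` is a secondary symptom: the JVM crashed inside libomp before the run file was written, and the evaluation script then failed to open it. A guard like the following would make that ordering explicit in the test output. It is a sketch only; `require_run_file` is a hypothetical helper and is not part of the test in the repository.

```python
import os


def require_run_file(path):
    """Raise a descriptive error if the run file is missing or empty."""
    if not os.path.isfile(path):
        raise RuntimeError(
            f"run file {path!r} is missing; the reranking step probably "
            "crashed before writing it, so check its output first"
        )
    if os.path.getsize(path) == 0:
        raise RuntimeError(f"run file {path!r} is empty")
    return path
```

Calling this before invoking the evaluation script points the reader at the earlier crash rather than at the missing file.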
Interestingly, on the M1 chip, lightgbm does work with the following install command:
conda install -c conda-forge lightgbm
Works fine on M series processors... simple solution: avoid x86 on Mac ;)
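If the test stays unreliable on Intel Macs, one way to encode "avoid x86 on Mac" is to skip it there. This is a sketch under assumptions: `is_x86_mac` and `TestLtrSketch` are hypothetical names, and the check looks at the architecture the Python interpreter reports, so an x86 Python running under Rosetta on an M series machine would also count as x86.

```python
import platform
import unittest


def is_x86_mac(system=None, machine=None):
    """Return True when running an x86_64 interpreter on macOS."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "x86_64"


class TestLtrSketch(unittest.TestCase):  # hypothetical stand-in for the real test class
    @unittest.skipIf(is_x86_mac(), "libomp segfault reported on x86 macOS in this thread")
    def test_reranking(self):
        # The real reranking test body would go here.
        self.assertTrue(True)
```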
Test failure on my iMac Pro, macOS Monterey 12.1... any ideas?