Open louyuenan opened 1 year ago
Hello, could you share the downloaded dataset with me? I can't access Yahoo. Thank you very much.
I run into the same problem when I run something like "init_instance_by_config(dataset_config)".
Of course, this problem is caused by the old version of the multiprocessing library, which cannot handle more than 60 logical cores. I "fixed" it by disabling Intel Hyper-Threading, but now initialization is very slow. I hope this gets fixed one day, or that GPU init support is added.
🐛 Bug Description
When I was trying to normalize the 1min data, I used the following command:
(env38) C:\Users\Anani>python scripts/data_collector/yahoo/collector.py normalize_data --qlib_data_1d_dir ~/.qlib/qlib_data/cn_data --source_dir ~/.qlib/stock_data/source/cn_data_1min --normalize_dir ~/.qlib/stock_data/source/cn_1min_nor --region CN --interval 1min --max_workers 8
I got this error:
2023-02-22 21:54:27.234 | INFO | data_collector.utils:get_calendar_list:106 - end of get calendar list: ALL.
[6924:MainThread](2023-02-22 21:55:01,958) INFO - qlib.Initialization - [config.py:416] - default_conf: client.
[6924:MainThread](2023-02-22 21:55:02,858) INFO - qlib.Initialization - [init.py:74] - qlib successfully initialized based on client settings.
[6924:MainThread](2023-02-22 21:55:02,859) INFO - qlib.Initialization - [init.py:76] - data_path={'__DEFAULT_FREQ': WindowsPath('C:/Users/Anani/.qlib/qlib_data/cn_data')}
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\Anani\anaconda3\envs\env38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\Anani\anaconda3\envs\env38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Anani\anaconda3\envs\env38\lib\multiprocessing\pool.py", line 519, in _handle_workers
    cls._wait_for_updates(current_sentinels, change_notifier)
  File "C:\Users\Anani\anaconda3\envs\env38\lib\multiprocessing\pool.py", line 499, in _wait_for_updates
    wait(sentinels, timeout=timeout)
  File "C:\Users\Anani\anaconda3\envs\env38\lib\multiprocessing\connection.py", line 879, in wait
    ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
  File "C:\Users\Anani\anaconda3\envs\env38\lib\multiprocessing\connection.py", line 811, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 72
Environment
OS (Windows, Linux, MacOS): win10 LTSC 1809
Additional Notes
I have learned that the problem may be caused by having more than 60 CPU cores. Is that true? If so, how do I limit the number of CPU cores used when running this project? Sincerely,
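
For anyone landing here with the same traceback: the ValueError comes from the Windows WaitForMultipleObjects call, which can wait on at most 64 handles at once, so a multiprocessing pool with roughly one worker per logical core breaks on machines with many cores. Below is a minimal sketch of capping the worker count before creating a pool. The helper name safe_worker_count and the constant WINDOWS_MAX_WORKERS are made up for illustration and are not part of qlib; the value 61 is borrowed from the limit that Python's concurrent.futures.ProcessPoolExecutor enforces on Windows. Where exactly qlib would need such a cap is not established in this thread.

```python
import os
import sys

# Assumption: 61 mirrors the cap that concurrent.futures.ProcessPoolExecutor
# applies on Windows; it leaves headroom under the 63-handle limit reported
# in the traceback above.
WINDOWS_MAX_WORKERS = 61


def safe_worker_count(requested=None, platform=sys.platform):
    """Return a worker count that stays under the Windows handle limit.

    Hypothetical helper, not a qlib API.
    requested: desired number of workers, or None to use all logical cores.
    platform:  defaults to the current platform; exposed for testing.
    """
    count = (os.cpu_count() or 1) if requested is None else requested
    count = max(1, count)  # never return fewer than one worker
    if platform == "win32":
        count = min(count, WINDOWS_MAX_WORKERS)
    return count


if __name__ == "__main__":
    # Pass the result to multiprocessing.Pool(processes=...) or a similar API.
    print(safe_worker_count())
```

With a helper like this, multiprocessing.Pool(processes=safe_worker_count()) would still use every core on Linux and MacOS and at most 61 workers on Windows, which avoids having to disable Hyper-Threading.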