XiaoTaoWang / EagleC

A deep-learning framework for predicting a full range of structural variations from bulk and single-cell contact maps

Inquiry for KeyError:'sweight' #29

Open Sanghyun-WaxingMoon opened 1 year ago

Sanghyun-WaxingMoon commented 1 year ago

Hello, I am Sanghyun, and I am trying to use EagleC.

I am an end user of bioinformatics tools rather than a developer.

When running predictSV, I encountered the error shown below.

I am asking here because I could not find a similar case among the other issues.

I tried cooler balance, but it did not help.

I would be very grateful if you could tell me where to start in resolving this issue.

(EagleC) sanghyun@ubuntu:/data4/sanghyun/micro-c/chuna$ predictSV --hic-5k C.mcool::/resolutions/5000 --hic-10k C.mcool::/resolutions/10000 --hic-50k C.mcool::/resolutions/50000 -O C.eagle -g other --balance-type CNV --output-format full --prob-cutoff-5k 0.8 --prob-cutoff-10k 0.8 --prob-cutoff-50k 0.99999
root INFO @ 07/21/23 09:44:38:

ARGUMENT LIST:

Cool URI at 5kb = C.mcool::/resolutions/5000

Cool URI at 10kb = C.mcool::/resolutions/10000

Cool URI at 50kb = C.mcool::/resolutions/50000

Balance Type = CNV

Reference Genome = other

Included Chromosomes = ['#', 'X']

Probability Cutoff for 5kb SVs = 0.8

Probability Cutoff for 10kb SVs = 0.8

Probability Cutoff for 50kb SVs = 0.99999

Output File Prefix = C.eagle

Output Format = full

Log file name = C.eagle.log

root INFO @ 07/21/23 09:44:38: Predict SVs at 5kb resolution ...
numexpr.utils INFO @ 07/21/23 09:44:41: Note: detected 256 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
numexpr.utils INFO @ 07/21/23 09:44:41: Note: NumExpr detected 256 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
numexpr.utils INFO @ 07/21/23 09:44:41: NumExpr defaulting to 8 threads.
root INFO @ 07/21/23 09:44:52: matched sequencing depth in human at 10Kb: 260375696.9216156
root INFO @ 07/21/23 09:44:52: Load CNN models from /data2/sanghyun/miniconda3/envs/EagleC/lib/python3.8/site-packages/eaglec/data/bulk/200M-300M ...
2023-07-21 09:44:52.291873: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2023-07-21 09:44:52.308725: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-21 09:44:52.342923: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
root INFO @ 07/21/23 09:44:56: Done
root INFO @ 07/21/23 09:44:56: Interemediate results at the 5kb resolution will be cached to .C.mcool.218585564.CNV.None.100000.None
eaglec.scoreUtils INFO @ 07/21/23 09:44:56: (1, 1): someone else is working on it, skip
eaglec.scoreUtils INFO @ 07/21/23 09:44:56: (10, 10): someone else is working on it, skip
eaglec.scoreUtils INFO @ 07/21/23 09:44:56: (11, 11): someone else is working on it, skip
eaglec.scoreUtils INFO @ 07/21/23 09:44:56: (12, 12): someone else is working on it, skip
Traceback (most recent call last):
  File "/data2/sanghyun/miniconda3/envs/EagleC/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'sweight'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data2/sanghyun/miniconda3/envs/EagleC/bin/predictSV-single-resolution", line 276, in <module>
    run()
  File "/data2/sanghyun/miniconda3/envs/EagleC/bin/predictSV-single-resolution", line 227, in run
    intra_expected_count = intraPredict(clr, cnn_models, chroms, cache_folder, seq_depth,
  File "eaglec/scoreUtils.pyx", line 1263, in eaglec.scoreUtils.intraPredict
  File "eaglec/scoreUtils.pyx", line 861, in eaglec.scoreUtils._intra_global_core
  File "/data2/sanghyun/miniconda3/envs/EagleC/lib/python3.8/site-packages/pandas/core/frame.py", line 3807, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/data2/sanghyun/miniconda3/envs/EagleC/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3804, in get_loc
    raise KeyError(key) from err
KeyError: 'sweight'
Traceback (most recent call last):
  File "/data2/sanghyun/miniconda3/envs/EagleC/bin/predictSV", line 176, in <module>
    run()
  File "/data2/sanghyun/miniconda3/envs/EagleC/bin/predictSV", line 112, in run
    subprocess.check_call(' '.join(command), shell=True)
  File "/data2/sanghyun/miniconda3/envs/EagleC/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'predictSV-single-resolution -H C.mcool::/resolutions/5000 --balance-type CNV -O C.eagle.CNN_SVs.5K.txt --genome other --output-format full -C "#" "X" --prob-cutoff 0.8 --logFile C.eagle.log' returned non-zero exit status 1.

XiaoTaoWang commented 1 year ago

Hi, if you want to use the CNV-normalized contact signals for SV detection, you will need to first perform the CNV normalization using NeoLoopFinder. Please refer to the Quick Start section for more details.
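
For anyone hitting the same KeyError, a quick way to confirm the cause is to check whether the bin table of each resolution already contains the column that the chosen balance type reads. The sketch below is only an illustration, not part of EagleC: the helper names are made up, and the mapping of `ICE` to the `weight` column is an assumption based on the default output of cooler balance. Only the `CNV` to `sweight` link comes from the traceback above.

```python
# Hypothetical helper, not part of EagleC or NeoLoopFinder.
# 'sweight' under CNV is taken from the traceback in this issue;
# 'weight' under ICE is an assumption (the default cooler balance column).
REQUIRED_COLUMN = {"CNV": "sweight", "ICE": "weight"}


def missing_column(bin_columns, balance_type):
    """Return the required bin column if it is absent, otherwise None."""
    required = REQUIRED_COLUMN.get(balance_type)
    if required is None or required in bin_columns:
        return None
    return required


def check_cool_uri(uri, balance_type="CNV"):
    """Open a cooler URI and report the missing weight column, if any."""
    import cooler  # imported lazily so missing_column works without cooler

    clr = cooler.Cooler(uri)
    # Fetch one row of the bin table just to read its column names.
    columns = list(clr.bins()[:1].columns)
    return missing_column(columns, balance_type)


if __name__ == "__main__":
    # A bin table that has been through cooler balance only.
    print(missing_column(["chrom", "start", "end", "weight"], "CNV"))
```

If `check_cool_uri("C.mcool::/resolutions/5000")` returns `'sweight'`, the CNV normalization has not yet been written to that resolution, which would match the error reported here.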

Sanghyun-WaxingMoon commented 1 year ago

Thank you so much. I missed the obvious instruction. I am now getting another error during neoloop and am struggling to fix it. Thank you.