LUMC / rna_cd

Detect RNA contamination in human Illumina DNA-seq experiments
GNU Affero General Public License v3.0

how to identify the percentage of RNA contamination in DNA-Seq? #8

Open jingydz opened 9 months ago

jingydz commented 9 months ago

I set up the contaminated (“positive”) and uncontaminated (“negative”) groups. The first contains my own sample, which may be contaminated; the second contains a public 1kgp sample:

$ cat positives.list
/xxx/bam_list/mysample.bam
$ cat negatives.list
/xxx/bam_list/NA12878.bam

and then ran:

rna_cd-train -c chrM -pl positives.list -nl negatives.list -j 3 \
    --chunksize 100 -o model.json --plot-out pca.png

but I got this error:

Start time is 2023/10/19--11:05
[ 2023-10-19 03:05:59.771258 ] Calculating features for mysample.bam
[ 2023-10-19 03:05:59.771483 ] Calculating features for NA12878.bam
[ 2023-10-19 03:10:22.019794 ] Done calculating features for mysample.bam
[ 2023-10-19 03:12:44.830715 ] Done calculating features for NA12878.bam
[ 2023-10-19 03:12:45.702813 ] Setting up processing pipeline for SVM model
[ 2023-10-19 03:12:45.708970 ] Starting grid search for SVC model with 3 cross validations
Traceback (most recent call last):
  File "/xxx/software/miniconda3/bin/rna_cd-train", line 8, in <module>
    sys.exit(train_cli())
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/rna_cd/cli.py", line 138, in train_cli
    model = train_svm_model(positives, negatives, chunksize=chunksize,
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/rna_cd/models.py", line 157, in train_svm_model
    searcher.fit(arr_X, arr_Y)
  File "/xxx/.local/lib/python3.9/site-packages/sklearn/model_selection/_search.py", line 875, in fit
    self._run_search(evaluate_candidates)
  File "/xxx/.local/lib/python3.9/site-packages/sklearn/model_selection/_search.py", line 1375, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/xxx/.local/lib/python3.9/site-packages/sklearn/model_selection/_search.py", line 125, in __init__
    raise ValueError(
ValueError: Parameter grid for parameter 'reduce_dim__n_components' need to be a non-empty sequence, got: []
Finish time is 2023/10/19--11:12

jingydz commented 9 months ago

When I changed chrM to chr21, I got a different error:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/xxx/.local/lib/python3.9/site-packages/joblib/__init__.py", line 120, in <module>
    from .parallel import Parallel
  File "/xxx/.local/lib/python3.9/site-packages/joblib/parallel.py", line 26, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/xxx/.local/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 17, in <module>
    from .pool import MemmappingPool
  File "/xxx/.local/lib/python3.9/site-packages/joblib/pool.py", line 40, in <module>
    import numpy as np
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/numpy/__init__.py", line 147, in <module>
    from . import lib
  File "/xxx/software/miniconda3/lib/python3.9/site-packages/numpy/lib/__init__.py", line 44, in <module>
    __all__ += type_check.__all__
NameError: name 'type_check' is not defined
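
The paths in this traceback show joblib being loaded from the user site (/xxx/.local) while numpy is loaded from the miniconda environment. Mixing the two locations is a possible, though unconfirmed, source of the broken numpy import. The sketch below uses only the Python standard library to print where each package would be imported from. The helper name package_locations and the list of package names are illustrative choices, not part of rna_cd.

```python
import importlib.util
import site


def package_locations(names):
    """Map each top-level package name to the file it would be imported from.

    find_spec locates a top-level package without executing it, so this also
    works when importing the package itself would fail. Missing packages map
    to None.
    """
    locations = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        locations[name] = spec.origin if spec is not None else None
    return locations


if __name__ == "__main__":
    user_site = site.getusersitepackages()
    found = package_locations(["numpy", "joblib", "sklearn", "rna_cd"])
    for name, origin in found.items():
        if origin is None:
            where = "not found"
        elif origin.startswith(user_site):
            where = "user site"
        else:
            where = "environment"
        print(f"{name}: {origin} ({where})")
```

If the output mixes "user site" and "environment" entries, it may be worth installing all of these packages into a single location before rerunning rna_cd-train.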

jingydz commented 9 months ago

How many BAM files are needed in each of the lists passed with -pl positives.list and -nl negatives.list?
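
The thread does not state a minimum, but the log above reports a grid search "with 3 cross validations", and stratified k-fold cross-validation generally expects at least as many samples per class as there are folds. The sketch below is a pre-flight check built on that assumption. The functions read_bam_list and check_group_sizes are hypothetical helpers, and the default of 3 folds is taken from the log message, not from the rna_cd documentation.

```python
from pathlib import Path


def read_bam_list(list_path):
    """Return the non-empty lines of a list file, one BAM path per line."""
    lines = Path(list_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def check_group_sizes(positives, negatives, n_folds=3):
    """Return one message per group that has fewer BAM files than CV folds.

    n_folds=3 is an assumption based on the "3 cross validations" log line.
    """
    problems = []
    for label, paths in (("positives", positives), ("negatives", negatives)):
        if len(paths) < n_folds:
            problems.append(
                f"{label}: {len(paths)} BAM file(s), fewer than {n_folds} folds"
            )
    return problems


if __name__ == "__main__":
    positives = read_bam_list("positives.list")
    negatives = read_bam_list("negatives.list")
    for problem in check_group_sizes(positives, negatives):
        print("WARNING:", problem)
```

With the lists shown in this issue (one BAM file each), both groups would be flagged.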