bartongroup / yanocomp

Yet another nanopore modification comparison tool
MIT License
11 stars 1 forks source link

ValueError: attempt to get argmax of an empty sequence #13

Open kwonej0617 opened 1 year ago

kwonej0617 commented 1 year ago

Hi, I have used yanocomp to detect RNA m6A in my data, but I got the error message in the log file, ValueError: attempt to get argmax of an empty sequence. The following is the command line I run.

#nanopolish eventalign
singularity exec /share/pkg/containers/nanopolish/0.14.0/nanopolish-0.14.0.sif nanopolish eventalign --scale-events --signal-index --print-read-names -r ${wt_fastq} -b ${wt_bam} -g ${ref} --threads 40 > wt_eventalign.txt
singularity exec /share/pkg/containers/nanopolish/0.14.0/nanopolish-0.14.0.sif nanopolish eventalign --scale-events --signal-index --print-read-names -r ${ko_fastq} -b ${ko_bam} -g ${ref} --threads 40 > ko_eventalign.txt

#prep
yanocomp prep -e wt_eventalign.txt -p 4 -h wt.hdf5
yanocomp prep -e ko_eventalign.txt -p 4 -h ko.hdf5

#gmmtest
yanocomp gmmtest -p 1 -c ko.hdf5 -t wt.hdf5 -o output.bed -s output_sm_preds.json.gz

I think that nanopolish eventalingn process was successfully done. wt_eventalign.txt

contig  position        reference_kmer  read_name       strand  event_index     event_level_mean        event_stdv      event_length    model_kmer      model_mean      model_stdv      standardized_level      start_idx       end_idx
ENST00000361390 1       TACCC   999e64e1-afac-4636-bee6-d47d92dab576    t       327     79.25   1.917   0.00398 TACCC   78.84   2.81    0.13    52632   52644
ENST00000361390 1       TACCC   999e64e1-afac-4636-bee6-d47d92dab576    t       328     72.59   0.383   0.00166 TACCC   78.84   2.81    -1.95   52627   52632
ENST00000361390 2       ACCCA   999e64e1-afac-4636-bee6-d47d92dab576    t       329     65.68   1.144   0.00697 ACCCA   66.30   2.07    -0.26   52606   52627
ENST00000361390 3       CCCAT   999e64e1-afac-4636-bee6-d47d92dab576    t       330     74.24   1.391   0.00631 CCCAT   73.36   2.11    0.37    52587   52606

ko_eventalign.txt

contig  position        reference_kmer  read_name       strand  event_index     event_level_mean        event_stdv      event_length    model_kmer      model_mean      model_stdv      standardized_level      start_idx       end_idx
ENST00000361390 1       TACCC   38cc1b08-a86d-4e1b-92d6-18e7506a3eee    t       310     78.13   2.096   0.00266 TACCC   78.84   2.81    -0.23   18946   18954
ENST00000361390 1       TACCC   38cc1b08-a86d-4e1b-92d6-18e7506a3eee    t       311     74.57   1.585   0.00232 TACCC   78.84   2.81    -1.37   18939   18946
ENST00000361390 2       ACCCA   38cc1b08-a86d-4e1b-92d6-18e7506a3eee    t       312     64.76   1.395   0.00299 ACCCA   66.30   2.07    -0.67   18930   18939
ENST00000361390 3       CCCAT   38cc1b08-a86d-4e1b-92d6-18e7506a3eee    t       313     71.24   2.687   0.00232 CCCAT   73.36   2.11    -0.91   18923   18930
ENST00000361390 4       CCATG   38cc1b08-a86d-4e1b-92d6-18e7506a3eee    t       314     82.43   2.061   0.00398 CCATG   82.78   2.13    -0.15   18911   18923

Also, here is the reference file.

>ENST00000542354
ATGTGGGGAGCTTTCCTTCTCTATGTTTCCATGAAGATGGGAGGCACTGCAGGACAAAGC
CTTGAGCAGCCCTCTGAAGTGACAGCTGTGGAAGGAGCCATTGTCCAGATAAACTGCACG
TACCAGACATCTGGGTTTTATGGGCTGTCCTGGTACCAGCAACATGATGGCGGAGCACCC
ACATTTCTTTCTTACAATGCTCTGGATGGTTTGGAGGAGACAGGTCGTTTTTCTTCATTC
CTTAGTCGCTCTGATAGTTATGGTTACCTCCTTCTACAGGAGCTCCAGATGAAAGACTCT
GCCTCTTACTTCTGCGCTGTGAGAGA
>ENST00000390423
ATGTGGGGAGTTTTCCTTCTTTATGTTTCCATGAAGATGGGAGGCACTACAGGACAAAAC
ATTGACCAGCCCACTGAGATGACAGCTACGGAAGGTGCCATTGTCCAGATCAACTGCACG
TACCAGACATCTGGGTTCAACGGGCTGTTCTGGTACCAGCAACATGCTGGCGAAGCACCT
ACATTTCTGTCTTACAATGTTCTGGATGGTTTGGAGGAGAAAGGTCGTTTTTCTTCATTC
CTTAGTCGGTCTAAAGGGTACAGTTACCTCCTTTTGAAGGAGCTCCAGATGAAAGACTCT
GCCTCTTACCTCTGTGCT

I am not sure the reason why I got the error message. Could you please help me fix the problem?

LEASE NOTE: This is an experimental module to load a centrally installed
miniconda environment. We do not recommend using this to install your own conda
environments; we do recommend installing miniconda in your /pi space to do so. 
2023-06-05 21:45:26,179 INFO     No GTF provided, records will be grouped by transcript
2023-06-06 01:46:18,838 INFO     No GTF provided, records will be grouped by transcript
2023-06-06 07:27:39,854 WARNING  Default min depth set to 6 to match window size 3
2023-06-06 07:27:39,860 INFO     Running gmmtest in 3-comp GMM (uniform outliers) mode with 1 control datasets and 1 treatment datasets
2023-06-06 07:28:12,823 INFO     22,587 genes to be processed on 1 workers
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py:40: RuntimeWarning: invalid value encountered in true_divide
  cdf2 = (np.searchsorted(data2, pooled, side='right'))/(1.0 * n2)
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py:48: RuntimeWarning: divide by zero encountered in double_scalars
  prob = stats.kstwobign.sf((en + 0.12 + 0.11 / en) * d)
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py:40: RuntimeWarning: invalid value encountered in true_divide
  cdf2 = (np.searchsorted(data2, pooled, side='right'))/(1.0 * n2)
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py:48: RuntimeWarning: divide by zero encountered in double_scalars
  prob = stats.kstwobign.sf((en + 0.12 + 0.11 / en) * d)
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py:23: RuntimeWarning: Mean of empty slice.
  mu = X.mean(axis=0)
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/numpy/core/_methods.py:181: RuntimeWarning: invalid value encountered in true_divi
de
  ret = um.true_divide(
/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py:24: RuntimeWarning: Mean of empty slice.
  sig = np.sqrt(((X - mu) ** 2.0).mean(0))
Traceback (most recent call last):
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/bin/yanocomp", line 8, in <module>
    sys.exit(cli())
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
return ctx.invoke(self.callback, **ctx.params)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/opts.py", line 16, in _make_dataclass
    return cmd(dynamic_dataclass(cls_name, bases=bases, **cli_kwargs))
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/gmmtest.py", line 333, in gmm_test
    res, sm_preds = parallel_test(opts)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/gmmtest.py", line 210, in parallel_test
    res, sm_preds = test_chunk(
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/gmmtest.py", line 164, in test_chunk
    was_tested, result, sm = position_stats(
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py", line 455, in position_stats
    r.ks_stat, ks_p_val = pca_kstest(
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py", line 71, in pca_kstest
    comps = pca_comp1(pooled)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/stats.py", line 28, in pca_comp1
    i1 = np.argmax(s ** 2.0)
  File "<__array_function__ internals>", line 5, in argmax
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 1195, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "/home/euijin.kwon-umw/miniconda3/envs/yanocomp/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
ValueError: attempt to get argmax of an empty sequence

Thank you!