Closed wososa closed 3 years ago
Hi, these errors aren't a real problem. They just mean that certain reads are discarded because of a high fraction of invalid kmers. Does Nanocompore complete?
Closed because inactive
Dear Nanocompore developer,
Nanocompore has been running for a very long time. The last step has been running for 5 days. Should I wait for it to finish?
2021-01-25T23:58:51.227901-0500 ERROR - MainProcess | High fraction of invalid kmers (38.95%) for read 8cb14954-dc85-44fe-aa0f-e7cd65055099
2021-01-25T23:58:51.228206-0500 ERROR - MainProcess | High fraction of invalid kmers (23.8%) for read 1e236488-381a-4c29-ba3e-513231becc18
2021-01-25T23:58:54.423011-0500 INFO - MainProcess | References found in index: 61621
2021-01-25T23:58:55.055914-0500 INFO - MainProcess | Filtering out references with low coverage
2021-01-25T23:58:58.913828-0500 INFO - MainProcess | References remaining after reference coverage filtering: 23286
2021-01-25T23:59:04.278296-0500 INFO - MainProcess | Starting data processing
2021-02-06T04:33:46.735469-0500 INFO - Process-3 | All Done. Transcripts processed: 23286
2021-02-06T04:33:48.124521-0500 INFO - MainProcess | Loading SampCompDB
2021-02-06T04:33:49.913571-0500 INFO - MainProcess | Calculate results
Thanks, Woody
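For reference, the time spent in each stage can be read off the log timestamps. A minimal Python sketch, assuming the timestamp format shown in the log above; the helper name `elapsed` and the constant `LOG_TS_FORMAT` are illustrative and not part of Nanocompore:

```python
from datetime import datetime

# Assumed format of the timestamps in the log excerpt above,
# e.g. 2021-01-25T23:59:04.278296-0500
LOG_TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def elapsed(start, end):
    """Return the timedelta between two log timestamps given as strings."""
    return datetime.strptime(end, LOG_TS_FORMAT) - datetime.strptime(start, LOG_TS_FORMAT)


# "Starting data processing" -> "All Done. Transcripts processed: 23286"
processing = elapsed(
    "2021-01-25T23:59:04.278296-0500",
    "2021-02-06T04:33:46.735469-0500",
)
print(processing)
```

On the two timestamps quoted above, this gives a little over 11 days for the data processing stage, before the "Calculate results" step begins.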
Describe the bug
After the data preparation step (https://nanocompore.rna.rocks/data_preparation/), I ran sampcomp and got a lot of ERROR messages.
To Reproduce
nanocompore sampcomp \
    --file_list1 testrun1_eventalign_collapsed_reads.tsv/out_eventalign_collapse.tsv \
    --file_list2 testrun2_eventalign_collapsed_reads.tsv/out_eventalign_collapse.tsv \
    --label1 testrun1 \
    --label2 testrun2 \
    --fasta mouse_transcriptome.fa \
    --outpath testrun1_v_testrun2
Expected behavior
The run shouldn't report errors.
Screenshots
WARNING - MainProcess | Running SampComp
INFO - MainProcess | Checking and initialising SampComp
DEBUG - MainProcess | package_name: nanocompore
DEBUG - MainProcess | package_version: 1.0.2
DEBUG - MainProcess | timestamp: 2020-12-30 10:45:54.156630
DEBUG - MainProcess | progress: False
DEBUG - MainProcess | nthreads: 22
DEBUG - MainProcess | exclude_ref_id: []
DEBUG - MainProcess | select_ref_id: []
DEBUG - MainProcess | max_invalid_kmers_freq: 0.1
DEBUG - MainProcess | downsample_high_coverage: 5000
DEBUG - MainProcess | min_ref_length: 100
DEBUG - MainProcess | min_coverage: 30
DEBUG - MainProcess | sequence_context_weights: uniform
DEBUG - MainProcess | sequence_context: 0
DEBUG - MainProcess | allow_warnings: False
DEBUG - MainProcess | anova: False
DEBUG - MainProcess | logit: True
DEBUG - MainProcess | comparison_methods: GMM,KS
DEBUG - MainProcess | overwrite: False
DEBUG - MainProcess | outprefix: out
DEBUG - MainProcess | outpath: testrun1_v_testrun2
DEBUG - MainProcess | fasta_fn: mouse_transcriptome.fa
INFO - MainProcess | Only 1 replicate found for condition testrun1
INFO - MainProcess | This is not recommended. The statistics will be calculated with the logit method
INFO - MainProcess | Only 1 replicate found for condition testrun2
INFO - MainProcess | This is not recommended. The statistics will be calculated with the logit method
DEBUG - MainProcess | OrderedDict([('testrun1', {'testrun1_1': 'testrun1_eventalign_collapsed_reads.tsv/out_eventalign_colla
INFO - MainProcess | Reading eventalign index files
ERROR - MainProcess | High fraction of invalid kmers (122.42%) for read aee62692-c826-40f7-81e3-264b9bcc8e74
ERROR - MainProcess | High fraction of invalid kmers (104.15%) for read 1a1aecb7-d796-4d19-ae69-439aa17c723e
ERROR - MainProcess | High fraction of invalid kmers (70.94%) for read 270cc13e-9696-4c8f-aa87-89cdf9945eaa
ERROR - MainProcess | High fraction of invalid kmers (21.26%) for read f7957b1b-dae1-4a28-a7e9-30be2bd657f9
ERROR - MainProcess | High fraction of invalid kmers (112.8%) for read 9d2744a5-fd9a-4e35-bfd9-0725cf56989a
ERROR - MainProcess | High fraction of invalid kmers (11.13%) for read d7a3d57c-3097-4cc1-81af-7d48a65c250e
ERROR - MainProcess | High fraction of invalid kmers (10.16%) for read 4381a43f-c44a-4f70-a5e7-baa121d461da
ERROR - MainProcess | High fraction of invalid kmers (10.33%) for read e1c346f6-f52d-4163-96ec-e7964f135c17
ERROR - MainProcess | High fraction of invalid kmers (12.69%) for read cee22501-abec-4279-a25a-0007488353e0
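To see how many reads are affected, the ERROR lines can be tallied from a saved log. A rough Python sketch, assuming the message wording shown in the log above; the function name and the sample text are illustrative and not part of Nanocompore. The comparison against 10% is an inference from the `max_invalid_kmers_freq: 0.1` value in the debug output, not confirmed behaviour:

```python
import re

# Matches the ERROR message wording quoted in this issue (an assumption
# about the log format; adjust if your version words it differently).
INVALID_KMER_RE = re.compile(
    r"High fraction of invalid kmers \((?P<pct>[\d.]+)%\) for read (?P<read_id>[0-9a-f-]+)"
)


def summarize_invalid_kmer_errors(log_text):
    """Return a list of (read_id, percent) pairs, one per invalid-kmer error."""
    return [
        (m.group("read_id"), float(m.group("pct")))
        for m in INVALID_KMER_RE.finditer(log_text)
    ]


# Three lines copied from the log above, used here as sample input.
sample_log = """\
ERROR - MainProcess | High fraction of invalid kmers (122.42%) for read aee62692-c826-40f7-81e3-264b9bcc8e74
ERROR - MainProcess | High fraction of invalid kmers (21.26%) for read f7957b1b-dae1-4a28-a7e9-30be2bd657f9
ERROR - MainProcess | High fraction of invalid kmers (10.16%) for read 4381a43f-c44a-4f70-a5e7-baa121d461da
"""

errors = summarize_invalid_kmer_errors(sample_log)
above_threshold = [pct for _, pct in errors if pct > 10.0]
print(f"{len(errors)} reads flagged, {len(above_threshold)} above 10%")
```

In the excerpt above, every flagged read is reported at more than 10%, which is consistent with the reads being dropped at that setting.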