gaoyubang / nanom6A

MIT License

ERROR: Requested column 1, but database file - only has fields 1 - 0. (predict_sites.py ) #8

Open Derryxu opened 3 years ago

Derryxu commented 3 years ago

I'm trying to predict m6A sites in Arabidopsis DRS data with nanom6A, but I received the following messages while running "predict_sites". Which files should I check? Is there any other information I should provide? Thank you very much!

#######################################3
2.start mapping
[M::mm_idx_gen::7.629*1.09] collected minimizers
[M::mm_idx_gen::11.920*1.20] sorted minimizers
[M::main::11.920*1.20] loaded/built the index for 7 target sequence(s)
[M::mm_mapopt_update::12.612*1.19] mid_occ = 87
[M::mm_idx_stat] kmer size: 14; skip: 5; is_hpc: 0; #seq: 7
[M::mm_idx_stat::13.148*1.17] distinct minimizers: 20096618 (62.37% are singletons); average occurrences: 2.019; average spacing: 2.949
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 --secondary=no -ax splice -uf -k14 -t 20 data/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa result.feature.fa
[M::main] Real time: 13.242 sec; CPU: 15.442 sec; Peak RSS: 1.737 GB
Could not build fai index result.feature.fa.fai
gene annotation


***** ERROR: Requested column 1, but database file - only has fields 1 - 0.
[INFO][SAMSequenceDictionaryProgress]done: N=0
parse bam
3.m6A site to genome sites
0it [00:00, ?it/s]
mkdir: cannot create directory ‘plot_nano_plot’: File exists
0it [00:00, ?it/s]

Derryxu commented 3 years ago

I also received the following 2 warnings while running "predict_sites":

2.start mapping
[M::mm_idx_gen::3.710*1.30] collected minimizers
[M::mm_idx_gen::4.281*2.67] sorted minimizers
[M::main::4.281*2.67] loaded/built the index for 7 target sequence(s)
[M::mm_mapopt_update::4.731*2.51] mid_occ = 87
[M::mm_idx_stat] kmer size: 14; skip: 5; is_hpc: 0; #seq: 7
[M::mm_idx_stat::4.977*2.44] distinct minimizers: 20096618 (62.37% are singletons); average occurrences: 2.019; average spacing: 2.949
[M::worker_pipeline::5.259*3.05] mapped 3630 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 --secondary=no -ax splice -uf -k14 -t 20 data/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa result.feature.fa
[M::main] Real time: 5.306 sec; CPU: 16.089 sec; Peak RSS: 2.051 GB
gene annotation
***** WARNING: File data/Araport11_GFF3_genes_transposons.201606.6.bed has inconsistent naming convention for record: Chr1 3630 3759 AT1G01010:five_prime_UTR:1 . +

***** WARNING: File data/Araport11_GFF3_genes_transposons.201606.6.bed has inconsistent naming convention for record: Chr1 3630 3759 AT1G01010:five_prime_UTR:1 . +


***** ERROR: Requested column 10, but database file - only has fields 1 - 0.
[INFO][SAMSequenceDictionaryProgress]done: N=3630
parse bam
3.m6A site to genome sites
100%|█████████████████████████████████████████████████████| 3627/3630 [00:08<00:00, 406.47it/s]

gaoyubang commented 3 years ago

[image]

Please check the ref bed file.
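The warning above flags a BED record whose chromosome is named Chr1, while the mapping step used data/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa. One possible cause, which is an assumption and not confirmed in this thread, is that the chromosome names in the BED file do not match the sequence names in the reference FASTA. A minimal Python sketch for checking this follows; the function names are hypothetical helpers and are not part of nanom6A.

```python
import sys


def fasta_names(lines):
    """Collect sequence names: the text after '>' up to the first whitespace."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].split()
    }


def bed_names(lines):
    """Collect column-1 names from BED records, skipping blank and header lines."""
    names = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def unmatched_bed_names(fasta_lines, bed_lines):
    """Return BED chromosome names that have no matching FASTA header."""
    return sorted(bed_names(bed_lines) - fasta_names(fasta_lines))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are whatever you passed to predict_sites):
    #   python check_names.py reference.fa annotation.bed
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as bed:
        missing = unmatched_bed_names(fa, bed)
    print("BED chromosomes missing from FASTA:", missing or "none")
```

If the script reports names such as Chr1, renaming the chromosomes in one of the two files so that both use the same convention would be the thing to try first.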

Derryxu commented 3 years ago

> [image] Please check the ref bed file.

Thank you!

Derryxu commented 3 years ago

Sorry to bother you again. I don't have genome files (anno.bed and anno.fa) for my data. Can the reads be mapped to a transcriptome reference for evaluation instead? If so, how should I do it? Looking forward to your reply. Thank you very much!

[image]

gaoyubang commented 3 years ago

It is feasible in principle.
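As a rough sketch of what that could look like: when a transcriptome FASTA is used as the reference, each transcript acts as its own "chromosome", so a matching annotation can be one BED record per transcript covering its full length. The code below assumes a six-column BED (chrom, start, end, name, score, strand), like the record shown in the warning earlier in this thread. Whether nanom6A accepts such a file unchanged is an assumption that has not been verified here, and `transcript_bed_records` is a hypothetical helper, not part of nanom6A.

```python
import sys


def transcript_bed_records(fasta_lines):
    """Yield one BED6 line per FASTA record, spanning the whole sequence.

    Assumption: transcripts are stored in sense orientation, so strand is '+'.
    """
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield f"{name}\t0\t{length}\t{name}\t.\t+"
            fields = line[1:].split()
            name = fields[0] if fields else None  # skip records with no name
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield f"{name}\t0\t{length}\t{name}\t.\t+"


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python fasta_to_bed.py transcriptome.fa > transcriptome.bed
    with open(sys.argv[1]) as fa:
        for record in transcript_bed_records(fa):
            print(record)
```

The resulting BED uses exactly the names found in the FASTA headers, which avoids the naming mismatch discussed above.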