Open cent0134 opened 4 months ago
Hi @cent0134,
The warning is normal. You can try running xpore without the --genome, --gtf_or_gff, and --transcript_fasta flags:
xpore dataprep --eventalign reads-ref.eventalign.txt --out_dir dataprep
Thanks!
Best wishes, Yuk Kei
Hi @yuukiiwa, thank you for taking the time to answer my question, but the key issue is that the output files are empty. Or do you mean that removing these flags also solves the problem of empty output files? I look forward to your reply.
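To pin down which outputs are actually empty, a minimal Python sketch like the one below may help. It only reports file sizes in the output directory. The file names are taken from this thread (data.index is assumed to be the intended spelling of one of them), and the helper name report_output_sizes is hypothetical, not part of xpore.

```python
from pathlib import Path

# Output names as reported in this thread (assumption: "data.index" is the
# intended spelling). Adjust the list if your xpore version writes other files.
EXPECTED_OUTPUTS = [
    "eventalign.index",
    "data.index",
    "data.json",
    "data.log",
    "data.readcount",
]


def report_output_sizes(out_dir):
    """Return {file name: size in bytes, or None if the file is missing}."""
    base = Path(out_dir)
    sizes = {}
    for name in EXPECTED_OUTPUTS:
        path = base / name
        sizes[name] = path.stat().st_size if path.is_file() else None
    return sizes


if __name__ == "__main__":
    # "dataprep" is the --out_dir used in the commands in this thread.
    for name, size in report_output_sizes("dataprep").items():
        if size is None:
            status = "missing"
        elif size == 0:
            status = "empty"
        else:
            status = f"{size} bytes"
        print(f"{name}: {status}")
```

Posting this listing together with the full console output makes it easier to tell a run that wrote nothing from one that wrote only some of the files.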
Hello, I keep receiving error messages and empty output files (data.index, data.json, data.log, data.readcount, eventalign.index). After I run
xpore dataprep --eventalign reads-ref.eventalign.txt --gtf_or_gff NL4-3.gtf --transcript_fasta NL4-3.fa --out_dir dataprep --genome
a warning appears. This is the complete error report:
xpore dataprep --eventalign reads-ref.eventalign.txt --gtf_or_gff NL4-3.gtf --transcript_fasta NL4-3.fa --out_dir dataprep --genome
/home/dell/miniconda3/envs/xpore/lib/python3.12/site-packages/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  chunk_split['line_length'] = np.array(lines)
These are my reference FASTA, GTF, and eventalign files.
gtf:
AF003887 GenBank transcript 790 2292 . + . transcript_id "gag_t01"; gene_id "gag"
AF003887 GenBank exon 790 2292 . + . transcript_id "gag_t01"; gene_id "gag";
AF003887 GenBank CDS 790 2292 . + 0 transcript_id "gag_t01"; gene_id "gag";
AF003887 GenBank transcript 2358 5096 . + . transcript_id "pol_t01"; gene_id "pol"
AF003887 GenBank exon 2358 5096 . + . transcript_id "pol_t01"; gene_id "pol";
AF003887 GenBank CDS 2358 5096 . + 0 transcript_id "pol_t01"; gene_id "pol";
AF003887 GenBank transcript 5041 5619 . + . transcript_id "vif_t01"; gene_id "vif"
AF003887 GenBank exon 5041 5619 . + . transcript_id "vif_t01"; gene_id "vif";
AF003887 GenBank CDS 5041 5619 . + 0 transcript_id "vif_t01"; gene_id "vif";
AF003887 GenBank transcript 5559 5849 . + . transcript_id "vpr_t01"; gene_id "vpr"
AF003887 GenBank exon 5559 5849 . + . transcript_id "vpr_t01"; gene_id "vpr";
AF003887 GenBank CDS 5559 5849 . + 0 transcript_id "vpr_t01"; gene_id "vpr";
AF003887 GenBank transcript 5830 8489 . + . transcript_id "tat_t01"; gene_id "tat"
AF003887 GenBank exon 5830 6044 . + . transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank exon 8399 8489 . + . transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank CDS 5830 6044 . + 0 transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank CDS 8399 8489 . + 1 transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank transcript 5969 8673 . + . transcript_id "rev_t01"; gene_id "rev"
AF003887 GenBank exon 5969 6044 . + . transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank exon 8399 8673 . + . transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank CDS 5969 6044 . + 0 transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank CDS 8399 8673 . + 2 transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank transcript 6224 8815 . + . transcript_id "env_t01"; gene_id "env"
AF003887 GenBank exon 6224 8815 . + . transcript_id "env_t01"; gene_id "env";
AF003887 GenBank CDS 6224 8815 . + 0 transcript_id "env_t01"; gene_id "env";
AF003887 GenBank transcript 8817 9437 . + . transcript_id "nef_t01"; gene_id "nef"
AF003887 GenBank exon 8817 9437 . + . transcript_id "nef_t01"; gene_id "nef";
AF003887 GenBank CDS 8817 9437 . + 0 transcript_id "nef_t01"; gene_id "nef";
ref fasta:
eventalign file:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level
AF003887 1 GGAAG 0 t 1389 100.39 2.906 0.00564 GGAAG 115.76 5.56 -2.37
AF003887 1 GGAAG 0 t 1390 120.60 7.913 0.02191 GGAAG 115.76 5.56 0.75
AF003887 2 GAAGG 0 t 1391 110.99 6.816 0.00266 GAAGG 105.26 4.06 1.21
AF003887 3 AAGGG 0 t 1392 116.69 7.518 0.02324 AAGGG 113.12 7.84 0.39
AF003887 4 AGGGC 0 t 1393 111.36 5.661 0.01195 AGGGC 116.40 4.05 -1.07
AF003887 5 GGGCT 0 t 1394 105.70 3.247 0.00432 GGGCT 113.28 5.31 -1.23
AF003887 6 GGCTA 0 t 1395 113.84 4.058 0.00830 GGCTA 110.69 3.55 0.76
AF003887 7 GCTAA 0 t 1396 88.25 2.053 0.00498 GCTAA 84.40 2.63 1.26
AF003887 7 GCTAA 0 t 1397 84.09 1.143 0.00299 GCTAA 84.40 2.63 -0.10
AF003887 8 CTAAT 0 t 1398 95.87 4.933 0.00730 CTAAT 96.70 3.04 -0.23
AF003887 8 CTAAT 0 t 1399 104.79 1.832 0.00365 CTAAT 96.70 3.04 2.28
Can you please help me solve this problem? Thank you very much.
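For anyone comparing the inputs posted above, here is a rough Python sketch that collects the names used in the eventalign contig column and the sequence names and transcript_id values used in the GTF, so they can be set side by side. Whether a naming mismatch is what produces the empty output is an assumption, not something confirmed in this thread. The function names are hypothetical, and fields are split on whitespace because that is how the files appear in this post.

```python
import re


def eventalign_contigs(lines):
    """Collect the distinct values of the first (contig) column."""
    contigs = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row, whose first field is "contig".
        if not fields or fields[0] == "contig":
            continue
        contigs.add(fields[0])
    return contigs


def gtf_names(lines):
    """Return (sequence names, transcript_id values) found in GTF lines."""
    seqnames, transcript_ids = set(), set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        seqnames.add(line.split()[0])
        match = re.search(r'transcript_id "([^"]+)"', line)
        if match:
            transcript_ids.add(match.group(1))
    return seqnames, transcript_ids


if __name__ == "__main__":
    # File names are the ones used in the commands in this thread.
    with open("reads-ref.eventalign.txt") as handle:
        contigs = eventalign_contigs(handle)
    with open("NL4-3.gtf") as handle:
        seqnames, transcript_ids = gtf_names(handle)
    print("eventalign contigs:", sorted(contigs))
    print("GTF sequence names:", sorted(seqnames))
    print("GTF transcript_ids:", sorted(transcript_ids))
```

On the excerpts posted here, the eventalign contig column holds the genome accession while the GTF transcript_id values are names such as gag_t01, so printing the three sets makes that difference visible at a glance.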