Closed zhengluo-lz closed 4 months ago
Hello,
Can you run head -n 1000 yourfile.gfa
and send me the result? Also, can you share the actual error you get?
Thanks.
test.zip
Sure, this is the GFA file created by vg. When I run the command pantera.R -g test.gfa.1 -o test_output, I get the error below:
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 5330 did not have 3 elements
Calls: get_segments -> read.table -> scan
Execution halted
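For anyone hitting a similar read.table error, the following is a minimal sketch for locating segment records with an unexpected number of tab-separated fields. It assumes the cause is a field-count mismatch in S lines; quote or comment characters in the file can trigger the same message. The line number R reports may also refer to a filtered input rather than to the original file. The function name find_irregular_segment_lines is hypothetical and is not part of pantera.

```python
def find_irregular_segment_lines(lines, expected_fields=3):
    """Return (line_number, field_count) for every S record whose
    tab-separated field count differs from expected_fields."""
    irregular = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Only segment records are inspected; H, L, P and W lines are skipped.
        if not line.startswith("S\t"):
            continue
        count = len(line.split("\t"))
        if count != expected_fields:
            irregular.append((number, count))
    return irregular


# Tiny in-memory example; for a real file, pass an open file handle instead.
sample = [
    "H\tVN:Z:1.1\n",
    "S\t1\tTAAACCC\n",
    "S\t2\tCCCTAAA\tLN:i:7\n",  # extra optional tag: 4 fields
    "S\t3\n",                    # missing sequence: 2 fields
]
print(find_irregular_segment_lines(sample))  # [(3, 4), (4, 2)]
```

With a real file this could be called as find_irregular_segment_lines(open("test.gfa")), since the function accepts any iterable of lines.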
Thanks, I uploaded a new version fixing that bug. Note, however, that pantera will not run on the test file you sent anyway, because the polymorphic segments are too small. You can reduce the minimum size (-s).
Implemented temporary fix for fread bug. Added a check for minimum number of polymorphic segments.
Thank you, I will try it again.
Why do I still get the same error when I run the new script?
I can confirm the fix works in Linux. What system are you using?
I use a Linux system. Are you sure you can run the new script on the GFA file I provided?
Yes, that is why I uploaded the fix. Do you get exactly the same error? Can you confirm you are running the new one, just in case?
Yes, I downloaded the script you just uploaded but still get the same error.
If you run echo -e '@@A\tB\tC\tD\tE\tF'; head test.gfa, is this what you get?
@@A B C D E F
H VN:Z:1.1
S 1 TAAACCCTAAACCCTAAACCCTAAACCCTAAA
S 2 CCCTAAACCCTAAACCCTAAACCCTAAACCCT
S 3 AAAACCCTAAACCCTAAACCCTAAAACCCTAA
S 4 ACCCTAAACCCTAAACCCTAAACCCTAAACCC
S 5 TAAACCCTAAACCCTAAACCCTAAACCCTAAA
S 6 CCCTAAACCCTAAACCCTAAACCCTAAACCCT
S 7 AAACCCTAAACCCTAAACCCTAAACCCTAAAC
S 8 CCTAAACCCTAAACCCTAAAACCCTAAACCCT
S 9 AAACCCTAAACCCTAAACCCTAAACCCTAAAC
Yes, it's the same as the one you provided.
In case it is related: in your comment you say you ran test.gfa.1, but the file you sent me is test.gfa. Are you sure we are talking about the same file? I can confirm the one you shared does not return an error on this version of pantera running in Linux.
Oh, I see, I got the wrong file, but now there's a new error.
Error in nchar(seq) :
cannot coerce type 'closure' to vector of type 'character'
Calls: [ -> [.data.table -> eval -> eval -> nchar -> nchar
In addition: Warning messages:
1: File '/tmp/RtmpBChtKw/file73a4d663968e6' has size 0. Returning a NULL data.table.
2: File '/tmp/RtmpBChtKw/file73a4d64c3ae24' has size 0. Returning a NULL data.table.
Execution halted
Please, share the new gfa file, if it is not too large, and the pantera.log of that run.
test.file.zip
This is the new GFA file. Thanks.
Also, which options did you use? As I mentioned, I don't think that GFA is a good representation of a pangenome of the variation graph type, as all segments have the same size (32). Pantera will not work correctly on that GFA. Can you share how it was generated?
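The observation that every segment has the same size can be checked with a short tally of segment lengths. This is a minimal sketch, not part of pantera; the function name segment_length_counts is hypothetical, and the sample lines reuse the first two segments from the head output earlier in this thread.

```python
from collections import Counter


def segment_length_counts(lines):
    """Count how many S records have each sequence length."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # S records are: S <name> <sequence> [optional tags...]
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            counts[len(fields[2])] += 1
    return counts


sample = [
    "H\tVN:Z:1.1",
    "S\t1\tTAAACCCTAAACCCTAAACCCTAAACCCTAAA",
    "S\t2\tCCCTAAACCCTAAACCCTAAACCCTAAACCCT",
    "L\t1\t+\t2\t+\t0M",
]
counts = segment_length_counts(sample)
print(dict(counts))  # {32: 2}
```

A single key in the result, as here, means every segment has the same length. A graph that collapses shared sequence into common segments would normally show a spread of lengths.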
Maybe it is because pantera found temporary files left in the folder by an aborted run. Can you check by using a different output folder? If that is the problem, I will add a check to confirm the output folder does not exist.
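A check of the kind described above, refusing to reuse an existing output folder, could look like the sketch below. It only illustrates the idea and is not pantera's actual implementation; ensure_new_output_dir is a hypothetical name.

```python
import os
import tempfile


def ensure_new_output_dir(path):
    """Create path and return it; raise FileExistsError if it already exists."""
    # exist_ok=False makes os.makedirs fail when the target is already present,
    # so leftovers from an aborted run are never silently reused.
    os.makedirs(path, exist_ok=False)
    return path


# Demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as base:
    out = os.path.join(base, "test_output")
    ensure_new_output_dir(out)  # first run: folder is created
    try:
        ensure_new_output_dir(out)  # second run: folder already exists
    except FileExistsError:
        print("output folder already exists; choose a new one")
```

Failing early is a deliberate choice here: it is simpler than cleaning up stale temporary files, and it never deletes anything the user may still want.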
Here are the VCF and FASTA files I used. I generated the GFA file using the following commands:
vg autoindex --prefix test --workflow giraffe --ref-fasta test.fa --vcf test.vcf.gz
vg convert --gfa-out --gbwtgraph-algorithm --no-wline test.giraffe.gbz > test.gfa
vg convert -fW --gfa-in test.gfa > test.new.gfa
test.gfa and test.new.gfa are different versions of the GFA file.
After changing the output folder name, there were no errors, but there were also no results. Is this because the GFA format created by vg has issues?
Thanks. I will upload a fix requiring the output folder to not exist, to avoid such issues. Regarding the vg file: I was reading about how it is built, and it seems to me that this GFA is mostly used as a way to pass information between tools, and that it has not been prepared to collapse the paths into common segments. I would suggest you use the results of either pggb or minigraph directly.
Sure, thank you. This is an excellent software for identifying transposons, especially for those working in maize genomics. I will recommend your software to more researchers who are working on maize genomics and transposon-related studies.
Thank you!
Uploaded fix to require a new (non-existing) output folder and prevent errors after failed runs.
Hi Pio,
Could you supply a small set of test files created by vg for pantera? I get an error when using the GFA format created by vg as input.