katerinakazantseva / strainy

Graph-based assembly phasing

gfapy merging fails occasionally #38

Closed mikolmogorov closed 1 year ago

mikolmogorov commented 1 year ago

The gfapy merging function (gfapy.GraphOperations.merge_linear_paths) occasionally fails, apparently when the graph contains segments whose sequence has length zero. Ensuring that such sequences are never created (e.g. by substituting a single nucleotide for the empty sequence) should resolve this. Example error log:

Traceback (most recent call last):
  File "/home/mkolmogo/projects/metaPhase/metaphase.py", line 87, in <module>
    main()
  File "/home/mkolmogo/projects/metaPhase/metaphase.py", line 81, in main
    sys.exit(transform_main())
  File "/home/mkolmogo/projects/metaPhase/metaphase/transform.py", line 853, in transform_main
    gfapy.GraphOperations.merge_linear_paths(initial_graph)
  File "/home/mkolmogo/miniconda3/lib/python3.7/site-packages/gfapy/graph_operations/linear_paths.py", line 151, in merge_linear_paths
    enable_tracking=enable_tracking)
  File "/home/mkolmogo/miniconda3/lib/python3.7/site-packages/gfapy/graph_operations/linear_paths.py", line 88, in merge_linear_path
    enable_tracking=enable_tracking)
  File "/home/mkolmogo/miniconda3/lib/python3.7/site-packages/gfapy/graph_operations/linear_paths.py", line 320, in __create_merged_segment
    merged_name=merged_name)
  File "/home/mkolmogo/miniconda3/lib/python3.7/site-packages/gfapy/graph_operations/linear_paths.py", line 231, in _add_segment_to_merged
    merged.sequence.append(s)
AttributeError: 'Placeholder' object has no attribute 'append'
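
For illustration, the substitution suggested above could be sketched on raw GFA1 text before the graph is handed to gfapy. This is only a sketch under assumptions: the function name `fill_empty_sequences` and the filler base `N` are hypothetical and not part of strainy or gfapy, the input is assumed to be well-formed tab-separated GFA1, and any `LN` tag on the segment is left untouched.

```python
def fill_empty_sequences(gfa_lines, filler="N"):
    """Replace empty or '*' sequences on GFA1 S-lines with a filler base.

    Hypothetical helper: 'filler' is an assumed placeholder nucleotide.
    """
    fixed = []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # S-lines have the form: S <name> <sequence> [optional tags...]
        if fields[0] == "S" and len(fields) >= 3 and fields[2] in ("", "*"):
            fields[2] = filler
        fixed.append("\t".join(fields))
    return fixed


example = [
    "H\tVN:Z:1.0",
    "S\tu1\tACGT",
    "S\tu2\t*",
    "L\tu1\t+\tu2\t+\t0M",
]
patched = fill_empty_sequences(example)
```

With a real base in place of the placeholder, the merged segment's sequence should no longer be a Placeholder object when merge_linear_paths appends to it, although that has not been checked against gfapy here.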

katerinakazantseva commented 1 year ago

Added a workaround, clean_g(g), which removes zero-length unitigs, virtual links, and self links. In the future we won't create zero-length unitigs or self links, but virtual links will remain. From the gfapy documentation: '''The order of the lines in GFA is not prescribed. Therefore, during parsing, or constructing a Gfa in memory, it is possible that a line is referenced to, before it is added to the Gfa instance. Whenever this happens, Gfapy creates a “virtual” line instance.'''
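
The idea behind the cleanup can be sketched on plain GFA1 text. This is not the actual clean_g from strainy, which operates on a gfapy graph; the function name `clean_gfa_lines` is hypothetical, the input is assumed to be well-formed tab-separated GFA1, and a link to a segment that has no S-line is treated here as the text-level analogue of a gfapy virtual line.

```python
def clean_gfa_lines(gfa_lines):
    """Drop zero-length segments, self links, and links to missing segments.

    Hypothetical text-level sketch of the cleanup, not strainy's clean_g.
    """
    records = [line.rstrip("\n").split("\t") for line in gfa_lines]

    # Segments that carry a real (non-empty, non-placeholder) sequence.
    kept_segments = {
        f[1]
        for f in records
        if f[0] == "S" and len(f) >= 3 and f[2] not in ("", "*")
    }

    cleaned = []
    for f in records:
        if f[0] == "S":
            # Drop segments without a name or with a zero-length sequence.
            if len(f) < 2 or f[1] not in kept_segments:
                continue
        elif f[0] == "L" and len(f) >= 5:
            # L-lines have the form: L <from> <orient> <to> <orient> <overlap>
            src, dst = f[1], f[3]
            if src == dst:
                continue  # self link
            if src not in kept_segments or dst not in kept_segments:
                continue  # endpoint removed or never defined
        cleaned.append("\t".join(f))
    return cleaned


example = [
    "H\tVN:Z:1.0",
    "S\tu1\tACGT",
    "S\tu2\t*",
    "S\tu3\tGG",
    "L\tu1\t+\tu2\t+\t0M",
    "L\tu1\t+\tu1\t+\t0M",
    "L\tu1\t+\tu3\t-\t0M",
    "L\tu3\t+\tu9\t+\t0M",
]
cleaned = clean_gfa_lines(example)
```

In this example only the header, segments u1 and u3, and the u1 to u3 link survive. Removing self links also discards genuine circular unitigs, so this fits the stopgap role described above rather than a permanent fix.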