JuliaWeb / Gumbo.jl

Julia wrapper around Google's gumbo C library for parsing HTML
Other
154 stars 25 forks source link

Fix print attributes #67

Closed stev47 closed 5 years ago

stev47 commented 5 years ago
codecov-io commented 5 years ago

Codecov Report

Merging #67 into master will not change coverage. The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #67   +/-   ##
=======================================
  Coverage   68.42%   68.42%           
=======================================
  Files           5        5           
  Lines         114      114           
=======================================
  Hits           78       78           
  Misses         36       36
Impacted Files Coverage Δ
src/io.jl 45.71% <100%> (ø) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update cc7864e...085c522. Read the comment docs.

porterjamesj commented 5 years ago

Thanks!

I'll admit I don't love the sorting. It adds a potentially confusing discrepancy between how a document is stored in memory and how it gets printed for display, as well as a performance cost. Is there any spec-motivated or other compelling reason to include the sorting?

The spacing fix looks great though! I'm happy to merge if you can remove the sorting.

Thanks for working on the printing code, it's definitely a bit of a mess and has many edge cases where it doesn't work well, doing an overhaul of it is on my todo list the next time I have substantial time and energy to work on this project.

stev47 commented 5 years ago

The attributes are stored in a Dict and therefore the order of attributes in the original document is not tracked. For the testsuite to be robust the output should be consistent and not rely on the (unspecified and arbitrary) iteration order of the Dict. Sorting makes this consistent and adds almost no overhead since the number of attributes is usually small and the IO time will be magnitudes larger.

If you still disagree, I can remove that commit.

porterjamesj commented 5 years ago

Hmm the point about test consistency is a good one, I'm also sold by the fact that BeautifulSoup does the same thing.