yeraydiazdiaz / lunr.py

A Python implementation of Lunr.js 🌖
http://lunr.readthedocs.io
MIT License
187 stars 16 forks

"Out of order word insertion" in lunr.js due to differences in string comparison between Python and JavaScript? #144

Open abdnh opened 8 months ago

abdnh commented 8 months ago

I'm using lunr.py to create a search index for use with lunr.js, and I hit an issue that I believe is due to differences in how Python and JavaScript compare strings.

One example to reproduce the issue:

"🔥" < "\uf0ae" 

This evaluates to false in Python but true in JavaScript.
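
A minimal sketch of why the two languages disagree. Python compares strings by Unicode code point, while JavaScript's `<` compares UTF-16 code units. A character outside the Basic Multilingual Plane such as 🔥 (U+1F525) is stored in UTF-16 as a surrogate pair whose first unit is 0xD83D, which is smaller than 0xF0AE even though the code point itself is larger. The helper name `utf16_key` is made up for illustration and is not part of lunr.py.

```python
def utf16_key(s: str) -> bytes:
    """Key that orders strings by UTF-16 code units, like JavaScript's `<`.

    Big-endian UTF-16 bytes compare in the same order as the code units
    themselves; "surrogatepass" lets lone surrogates be encoded too.
    """
    return s.encode("utf-16-be", "surrogatepass")


fire = "🔥"             # U+1F525, UTF-16 units: 0xD83D 0xDD25
private_use = "\uf0ae"  # U+F0AE,  UTF-16 unit:  0xF0AE

# Python's native comparison works on code points: 0x1F525 > 0xF0AE.
by_code_point = fire < private_use

# Emulating JavaScript: the first code unit 0xD83D < 0xF0AE.
by_code_unit = utf16_key(fire) < utf16_key(private_use)

print(by_code_point, by_code_unit)  # False True
```

The same key can be passed to `sorted(..., key=utf16_key)` to get the ordering JavaScript would produce.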

What's the best way to handle this?
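
One possible workaround, sketched under assumptions and not verified against lunr.js: re-sort the serialized terms by UTF-16 code units before writing the JSON. This assumes the serialized index stores `invertedIndex` as a list of `[term, posting]` pairs, as the lunr.js serialization format does, and that term order is the only thing lunr.js rejects. `resort_for_js` and `utf16_key` are hypothetical helpers, not lunr.py API.

```python
import copy


def utf16_key(term: str) -> bytes:
    # Big-endian UTF-16 bytes sort in the same order as UTF-16 code units.
    return term.encode("utf-16-be", "surrogatepass")


def resort_for_js(serialized: dict) -> dict:
    """Return a shallow copy whose invertedIndex is in JavaScript string order.

    Assumption: serialized["invertedIndex"] is a list of [term, posting] pairs.
    """
    result = copy.copy(serialized)
    result["invertedIndex"] = sorted(
        serialized["invertedIndex"], key=lambda pair: utf16_key(pair[0])
    )
    return result


# Stand-in for a serialized index; postings are left empty for brevity.
example = {"invertedIndex": [["\uf0ae", {}], ["🔥", {}]]}
fixed = resort_for_js(example)
```

With a lunr.py index `idx`, this would be applied as `json.dump(resort_for_js(idx.serialize()), file)`.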

yeraydiazdiaz commented 5 months ago

Can you provide a more complete example of the interaction between the indices from lunr.py and lunr.js?

abdnh commented 5 months ago

Here's an example to reproduce the error.

  1. Create a search index using lunr.py:

    import json

    from lunr import lunr

    documents = [
        {"id": "a", "body": "🔥"},
        {"id": "b", "body": "\uf0ae"},
    ]

    idx = lunr(ref="id", fields=("body",), documents=documents)
    with open("index.json", "w", encoding="utf-8") as file:
        json.dump(idx.serialize(), file)


  2. Load it into lunr.js (example in Node.js):
```javascript
const lunr = require("lunr");

async function loadIndex() {
    const serializedIndex = require("./index.json");
    lunr.Index.load(serializedIndex);
}

loadIndex();