sillsdevarchive / languageforge-mongo-2022

(archived) language forge experimental code with mongo (false start)
MIT License

List and filter entries #7

Open hahn-kev opened 1 year ago

hahn-kev commented 1 year ago

should include some filtering capabilities

myieye commented 1 year ago

Just for the record, code for querying with the .NET MongoDB Driver could look something like this, but none of it is supported by the driver:

Strange archived code

```csharp
// Note: the generic type arguments were lost from the posted snippet.
// List<EntryDto> follows from the body; Entry and ValueWrapper are assumed names.
public async Task<List<EntryDto>> FindEntries(string projectCode, string? filter, string? inputSystem,
    string? partOfSpeech, string? semanticDomain, int? skip, int? take)
{
    var filterBuilder = new FilterDefinitionBuilder<Entry>();
    var FilterDefinition = filterBuilder.Empty;

    // Free-text filter: match against lexeme, citation form, definitions and glosses.
    if (!string.IsNullOrWhiteSpace(filter))
    {
        FilterDefinition &= filterBuilder.Or(
            filterBuilder.AnyStringIn(e => e.Lexeme.Values.Select(v => v.Value), filter),
            filterBuilder.AnyStringIn(e => e.CitationForm.Values.Select(v => v.Value), filter),
            filterBuilder.AnyStringIn(e => e.Senses.SelectMany(s => s.Definition.Values.Select(v => v.Value)), filter),
            filterBuilder.AnyStringIn(e => e.Senses.SelectMany(s => s.Gloss.Values.Select(v => v.Value)), filter)
        );
        // Probably better?: filterBuilder.Text(filter);
    }

    // Part-of-speech filter: resolve the search term to option keys first.
    if (!string.IsNullOrWhiteSpace(partOfSpeech))
    {
        var partsOfSpeech = await _optionsService.FindPartsOfSpeechKeys(projectCode, partOfSpeech);
        if (partsOfSpeech.IsNullOrEmpty())
        {
            return new List<EntryDto>();
        }
        FilterDefinition &= filterBuilder.Or(
            partsOfSpeech.Select(pos => filterBuilder.AnyStringIn(e => e.Senses.Select(s => s.PartOfSpeech.Value), pos)));
    }

    // Semantic-domain filter.
    if (!string.IsNullOrWhiteSpace(semanticDomain))
    {
        FilterDefinition &= filterBuilder.AnyStringIn(e => e.Senses.SelectMany(s => s.SemanticDomain.Values), semanticDomain);
    }

    var entries = await _projectDbContext.Entries(projectCode)
        .Find(FilterDefinition)
        .Skip(skip)
        .Limit(take)
        .ToListAsync();

    // Map to DTOs, keeping only the requested input system (if any).
    return entries.Select(e => new EntryDto
    {
        Lexeme = FilterInputSystems(e.Lexeme, inputSystem),
        CitationForm = FilterInputSystems(e.CitationForm, inputSystem),
        Sense = e.Senses.Select(sense =>
        {
            return new SenseDto
            {
                Gloss = FilterInputSystems(sense.Gloss, inputSystem),
                Definition = FilterInputSystems(sense.Definition, inputSystem),
                PartOfSpeech = sense.PartOfSpeech?.Value,
                SemanticDomains = sense.SemanticDomain?.Values,
            };
        }).ToList(),
    }).ToList();
}

// Unwraps { inputSystem: { value } } to { inputSystem: value }, optionally keeping a single input system.
private static Dictionary<string, string> FilterInputSystems(Dictionary<string, ValueWrapper>? inputSystemValues, string? inputSystem)
{
    return inputSystemValues?
        .Where(kvp => string.IsNullOrWhiteSpace(inputSystem) || kvp.Key == inputSystem)
        .ToDictionary(kvp => kvp.Key, kvp => kvp.Value.Value)
        ?? new Dictionary<string, string>();
}
```

myieye commented 1 year ago

We also need to decide officially on the format of fields. I'm primarily thinking of input-system-dependent fields like this: [image]

In the database they're represented as dictionaries of objects with a single Value property:

"lexeme": {
    "oc": {
      "value": "Value"
    },
    "en": {
      "value": "Value-2"
    }
}

Do we have reason to believe that these fields will gain additional properties? I.e. is it future safe for the API to just "unwrap" these to:

"lexeme": {
    "oc": "Value",
    "en": "Value-2"
}
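
For illustration, the "unwrap" step could be sketched in C# roughly as follows. `ValueWrapper` is a hypothetical stand-in for whatever type the stored `{ "value": ... }` objects map to; it is not taken from the actual codebase.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Stored shape: input system -> { "value": "..." }
var stored = new Dictionary<string, ValueWrapper>
{
    ["oc"] = new ValueWrapper("Value"),
    ["en"] = new ValueWrapper("Value-2"),
};

// API shape: input system -> plain string
Dictionary<string, string> unwrapped =
    stored.ToDictionary(kvp => kvp.Key, kvp => kvp.Value.Value);

// Prints the unwrapped dictionary as a JSON object.
Console.WriteLine(JsonSerializer.Serialize(unwrapped));

// Hypothetical wrapper type (assumption, named for this sketch only).
public record ValueWrapper(string Value);
```

The catch is that once the API returns plain strings, adding a second property per input system later would change the shape of every value.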

Also, there was already some discussion regarding whether the API keeps these as dictionaries or formats them as lists instead:

"lexeme": [
   { "inputSystem": "oc", "value": "Value" },
   { "inputSystem": "en", "value": "Value-2" },
]
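
As a rough sketch of the list option, a dictionary could be projected into a list of objects like this. `InputSystemValue` is a hypothetical DTO name, and the camel-case naming policy is only there to match the JSON above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

var lexeme = new Dictionary<string, string>
{
    ["oc"] = "Value",
    ["en"] = "Value-2",
};

// One object per input system instead of one key per input system.
List<InputSystemValue> asList = lexeme
    .Select(kvp => new InputSystemValue(kvp.Key, kvp.Value))
    .ToList();

var options = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };
Console.WriteLine(JsonSerializer.Serialize(asList, options));

// Hypothetical DTO (assumption, named for this sketch only).
public record InputSystemValue(string InputSystem, string Value);
```

A JSON array carries an explicit order, whereas the members of a JSON object are formally unordered, which is part of why ordering comes up in this discussion.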

The pros and cons I can think of:

Dictionary:

List:

Edit: In summary there are 3 options:

longrunningprocess commented 1 year ago

I personally prefer the "Dictionary":

"lexeme": {
    "oc": "Value",
    "en": "Value-2"
}

As for the order, IMO, that belongs to the consumer. The backend could simply present them in order of creation.

myieye commented 1 year ago

I did a bit of digging to confirm that (1) writing/input systems have a defined order in FLEx projects and (2) the order can even be configured at the field level, though I'm not quite sure what this is about. My suspicion is that it's actually only applicable to exporting/publishing, so not necessarily relevant to our current case.

So, I don't think the order belongs to the consumer. There is ultimately a configured order that we'll need to make available to the consumer somehow.

(1) [image]

(2) [image]

longrunningprocess commented 1 year ago

Yes. In that case I wouldn't change my preference; however, I would maintain the order preferences/configuration in a separate location or data object, not as part of the entry and not by some means inherent to the entry's data.

hahn-kev commented 1 year ago

I do like the dictionary. But it is very restrictive.

With a list of objects we can just add additional properties without breaking anything. Maybe that won't happen today, but the tradeoffs don't seem worth it.

I agree; I think the consumer is responsible for the order. That could mean that the order is stored out of band in some configuration for the project.
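
A minimal sketch of what "out of band" ordering could look like on the consumer side, assuming the project configuration exposes an ordered list of input-system codes. The `inputSystemOrder` array and the extra `fr` value are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical project-level configuration, stored separately from the entries.
string[] inputSystemOrder = { "en", "oc" };

var lexeme = new Dictionary<string, string>
{
    ["oc"] = "Value",
    ["en"] = "Value-2",
    ["fr"] = "Value-3",
};

// Sort by configured position; input systems missing from the configuration go last.
IEnumerable<KeyValuePair<string, string>> ordered = lexeme.OrderBy(kvp =>
{
    int index = Array.IndexOf(inputSystemOrder, kvp.Key);
    return index < 0 ? int.MaxValue : index;
});

foreach (var (inputSystem, value) in ordered)
{
    Console.WriteLine($"{inputSystem}: {value}");
}
// Prints:
// en: Value-2
// oc: Value
// fr: Value-3
```

With this approach the entry payload stays order-agnostic, and a change to the configured order does not touch any entry data.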

megahirt commented 1 year ago

There is a related discussion about metadata and structure over at https://github.com/sillsdev/web-languageforge/issues/1147

myieye commented 1 year ago

As discussed in our team meeting today, we'll go with dictionaries of objects:

"lexeme": {
    "oc": {
      "value": "Value"
    },
    "en": {
      "value": "Value-2"
    }
}
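
To illustrate why this shape leaves room to grow, here is a small sketch using `System.Text.Json`, which by default ignores JSON properties that have no matching member on the target type. `FieldValue` and the `isDraft` property are hypothetical; they only stand in for "some property added later".

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// A future payload in which one value has gained a hypothetical extra property.
string json = @"{ ""oc"": { ""value"": ""Value"", ""isDraft"": true }, ""en"": { ""value"": ""Value-2"" } }";

var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

// A client that only knows about "value" still deserializes the payload.
var lexeme = JsonSerializer.Deserialize<Dictionary<string, FieldValue>>(json, options)!;

Console.WriteLine(lexeme["oc"].Value); // Value
Console.WriteLine(lexeme["en"].Value); // Value-2

// Hypothetical client-side type that predates the new property.
public record FieldValue(string Value);
```

The same would not hold for the unwrapped form, where each value is a bare string and there is nowhere to attach a new property.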

We're ultimately splitting hairs here, because we figured that both are future-proof. But here are some things we talked about: