apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Remove FieldType copy constructor [LUCENE-4126] #5198

Open asfimport opened 12 years ago

asfimport commented 12 years ago

Currently FieldTypes can be created using new FieldType(someOtherFieldType), which copies the properties and allows them to then be changed. This reduces readability since it hides which properties someOtherFieldType has enabled. We should encourage users (and ourselves) to explicitly state which properties are enabled, so as to prevent any surprises.


Migrated from LUCENE-4126 by Chris Male, updated May 09 2016

asfimport commented 12 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Let's play out a few typical, realistic scenarios that are not too expert:

  1. user has a text field and wants to enable term vectors (so they can use highlighting/MLT)
  2. user has a string field and wants to enable norms (so they can use index-time boosting)

What does the before/after picture look like here? Is it easier? Is it trappy?

asfimport commented 12 years ago

Chris Male (migrated from JIRA)

Good question.

Currently specifying your own FieldType means you have to use Field rather than StringField or TextField as neither of them accept a FieldType. This is messy and basically the same problem that #5173 is fixing for storing. Hmm..

In relation to the copy constructor issue, for scenario #1 users could currently do:

FieldType myNewFieldType = new FieldType(TextField.TYPE_STORED);
myNewFieldType.setStoreTermVectors(true);

With the copy constructor removed, they would need to do:

FieldType myNewFieldType = new FieldType();
myNewFieldType.setIndexed(...);
myNewFieldType.setStored(...);
... // set other properties
myNewFieldType.setStoreTermVectors(true);

In the current case, the user can simply rely on the pre-existing type and change only the property they're interested in. Their code makes clear what was changed, since no other properties need to be set. At the same time, any changes to the pre-existing type would flow into their type without them being notified, and they cannot scan their code and see exactly which properties are set for a field; they'd have to look up the definition.

With the copy constructor removed, changing a property becomes more of a task for the user, since they would need to define all the properties themselves. Yet they would be protected from any changes to pre-existing types, and they could see in their code exactly which properties were set. But it also wouldn't be so easy to see which property was specifically changed.
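To make the trade-off above concrete, here is a minimal sketch. It deliberately does not use Lucene: SketchFieldType, textStoredPreset, viaCopy and viaExplicit are all hypothetical names invented for illustration, and the flags only loosely mirror the properties discussed in this issue. It shows that a type built with a copy constructor silently follows a change to the preset, while a type built explicitly does not.

```java
public class CopyVsExplicitSketch {

    // Hypothetical stand-in for a field type; this is NOT Lucene's FieldType.
    static final class SketchFieldType {
        boolean indexed;
        boolean stored;
        boolean tokenized;
        boolean storeTermVectors;

        SketchFieldType() {}

        // Copy constructor: inherits every flag from the other type.
        SketchFieldType(SketchFieldType other) {
            this.indexed = other.indexed;
            this.stored = other.stored;
            this.tokenized = other.tokenized;
            this.storeTermVectors = other.storeTermVectors;
        }
    }

    // Hypothetical preset, playing the role of a predefined stored text type.
    static SketchFieldType textStoredPreset() {
        SketchFieldType t = new SketchFieldType();
        t.indexed = true;
        t.stored = true;
        t.tokenized = true;
        return t;
    }

    // Copy style: only the changed property is visible at the call site.
    static SketchFieldType viaCopy(SketchFieldType preset) {
        SketchFieldType t = new SketchFieldType(preset);
        t.storeTermVectors = true;
        return t;
    }

    // Explicit style: every property is spelled out, nothing is inherited.
    static SketchFieldType viaExplicit() {
        SketchFieldType t = new SketchFieldType();
        t.indexed = true;
        t.stored = true;
        t.tokenized = true;
        t.storeTermVectors = true;
        return t;
    }

    public static void main(String[] args) {
        SketchFieldType preset = textStoredPreset();
        System.out.println("copy, tokenized:  " + viaCopy(preset).tokenized);

        // Simulate the preset changing in a later release.
        preset.stored = false;
        System.out.println("copy, stored:     " + viaCopy(preset).stored);
        System.out.println("explicit, stored: " + viaExplicit().stored);
    }
}
```

In this sketch the copied type picks up the changed stored flag without any edit to the calling code, whereas the explicit type keeps the value it was written with.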

I'm not really sure what's best, what do you think?

asfimport commented 12 years ago

Robert Muir (@rmuir) (migrated from JIRA)

> Currently specifying your own FieldType means you have to use Field rather than StringField or TextField as neither of them accept a FieldType. This is messy and basically the same problem that #5173 is fixing for storing. Hmm..

Actually I think this is OK: these are still expert-ish things, just not totally crazy.

I don't understand the benefit of removing this: having someone create a FieldType from scratch is crazy. It's way too ridiculous: too easy to forget to set tokenized to true or whatever. Creating a FieldType from scratch is pretty much only useful for committers or people extending things in super-expert ways.

So I think it's clear what's best: we have to keep Lucene usable.
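The "easy to forget a flag" trap can be illustrated with a small sketch. Everything in it is an assumption made for illustration: SketchFieldType, textPreset, fromScratch and fromPreset are hypothetical names, not Lucene classes or methods, and the stand-in defaults every flag to false, which may differ from the defaults of the real FieldType.

```java
public class FromScratchTrapSketch {

    // Hypothetical stand-in for a field type; this is NOT Lucene's FieldType.
    // Assumption: every flag defaults to false here.
    static final class SketchFieldType {
        boolean indexed;
        boolean stored;
        boolean tokenized;
        boolean storeTermVectors;

        SketchFieldType() {}

        // Copy constructor: inherits every flag from the other type.
        SketchFieldType(SketchFieldType other) {
            this.indexed = other.indexed;
            this.stored = other.stored;
            this.tokenized = other.tokenized;
            this.storeTermVectors = other.storeTermVectors;
        }
    }

    // Hypothetical preset, playing the role of a predefined text type.
    static SketchFieldType textPreset() {
        SketchFieldType t = new SketchFieldType();
        t.indexed = true;
        t.stored = true;
        t.tokenized = true;
        return t;
    }

    // From scratch: the author meant "text field plus term vectors"
    // but forgot tokenized, so it silently keeps its default (false).
    static SketchFieldType fromScratch() {
        SketchFieldType t = new SketchFieldType();
        t.indexed = true;
        t.stored = true;
        t.storeTermVectors = true;
        return t;
    }

    // From the preset: tokenized comes along without being mentioned.
    static SketchFieldType fromPreset() {
        SketchFieldType t = new SketchFieldType(textPreset());
        t.storeTermVectors = true;
        return t;
    }

    public static void main(String[] args) {
        System.out.println("from scratch, tokenized: " + fromScratch().tokenized);
        System.out.println("from preset,  tokenized: " + fromPreset().tokenized);
    }
}
```

Under these assumptions, both methods compile and look plausible, but only the preset-based one ends up tokenized; the omission in the from-scratch version produces no error.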

asfimport commented 12 years ago

Chris Male (migrated from JIRA)

I can agree with that

asfimport commented 12 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Also, I think that today anyone who wants to do things the way you describe can already just create the FieldType from scratch?

They can do this and set everything from scratch, add the field twice, whatever they want :)

But if we remove the ability to do simpler things like 'I want a TextField with term vectors enabled' or 'I want a StringField with index-time boosts', then I think that's a big loss to less advanced users, with no gain to the experts, who can already do things from scratch anyway if they prefer to do that.

asfimport commented 12 years ago

Chris M. Hostetter (@hossman) (migrated from JIRA)

bulk cleanup of 4.0-ALPHA / 4.0 Jira versioning. all bulk edited issues have hoss20120711-bulk-40-change in a comment

asfimport commented 12 years ago

Robert Muir (@rmuir) (migrated from JIRA)

rmuir20120906-bulk-40-change

asfimport commented 11 years ago

Steven Rowe (@sarowe) (migrated from JIRA)

Bulk move 4.4 issues to 4.5 and 5.0

asfimport commented 10 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

Move issue to Lucene 4.9.