elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.94k stars 24.74k forks source link

Simplify src directory structure #19611

Closed rjernst closed 8 years ago

rjernst commented 8 years ago

We currently follow the maven pattern of src/main/java, src/test/java, etc. Lucene, alternatively, removes one level of indirection, having src/java and src/test. I think it would be great to remove this extra directory level, as it makes working from a shell that much simpler (not everyone always works from an IDE). Since it would be hard to maintain an open PR for this, I thought I would first open an issue to get agreement. If we agree, I would do it on a weekend to minimize merge conflicts for open PRs (although in my experience moves will merge ok, as long as too much wasn't change in a branch from the original file).

Note that from the build side, the change is trivial (only needs to be changed in BuildPlugin).

So the directory changes would be: src/main/java -> src/java src/main/resources -> src/resources src/test/java -> src/test src/test/resources -> src/test-resources

dadoonet commented 8 years ago

I really like using defaults. Gradle uses same default values as Maven as explained in https://docs.gradle.org/current/userguide/java_plugin.html

I would not modify it. So I'm -1 on this.

rjernst commented 8 years ago

Why are using defaults more important than ease of use? This is an actual nuisance for those who navigate the directories from the command line. We need to have less deep directories structures in general (read: less java packages), but this is an easy change that could help with navigating to every single class in the source.

rjernst commented 8 years ago

And for those using IDEs, it will not change one thing for them, but for those using the shell, it helps.

nik9000 commented 8 years ago

I'm ambivalent. I don't think it'd save me any time and it is easier on folks who aren't always in the code base if we follow the "standard" directory lay out. But I don't see the slight change you are proposing to be a big deal either way. It is close enough to the normal structure that anyone used to maven will figure it out in seconds.

Where would you move src/test/resources? src/test-resources? I suspect I'd get to both the same way, str.

On Jul 26, 2016 6:55 PM, "Ryan Ernst" notifications@github.com wrote:

We currently follow the maven pattern of src/main/java, src/test/java, etc. Lucene, alternatively, removes one level of indirection, having src/java and src/test-java. I think it would be great to remove this extra directory level, as it makes working from a shell that much simpler (not everyone always works from an IDE). Since it would be hard to maintain an open PR for this, I thought I would first open an issue to get agreement. If we agree, I would do it on a weekend to minimize merge conflicts for open PRs (although in my experience moves will merge ok, as long as too much wasn't change in a branch from the original file).

Note that from the build side, the change is trivial (only needs to be changed in BuildPlugin).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/elastic/elasticsearch/issues/19611, or mute the thread https://github.com/notifications/unsubscribe-auth/AANLohsrgVCLLJSbn9oZw1_yBN9uVuGtks5qZpBlgaJpZM4JVrM3 .

rjernst commented 8 years ago

Where would you move src/test/resources? src/test-resources

Yes, that is what lucene does.

nik9000 commented 8 years ago

I don't think it'd help me in the shell because I just tab complete through all the directories. But it wouldn't hurt me either.

On Jul 26, 2016 9:26 PM, "Ryan Ernst" notifications@github.com wrote:

And for those using IDEs, it will not change one thing for them, but for those using the shell, it helps.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/elastic/elasticsearch/issues/19611#issuecomment-235455157, or mute the thread https://github.com/notifications/unsubscribe-auth/AANLog7-x5j5qeJ770brCuBiiOzIt591ks5qZrPBgaJpZM4JVrM3 .

rmuir commented 8 years ago

Yeah, i dont use your stupid shell, or your stupid IDE with its bogus tab completion.

Please, clean up the disorganized, overly complex filesystem hierarchy of thousands of undocumented needless packages and classes so the can be navigated.

This is literally :poop: code. Please, let's make it better.

rmuir commented 8 years ago

It is easy to see the crappiness of the organization mathematically:

Package depth in some cases (including src/main/java/....) gets to depths of 10 or 11.

The combination of these factors makes it nearly impossible to navigate this big pile of spaghetti. Lets remove needless levels of abstraction, please, that should be the number 1 priority for improving the elasticsearch code.

ywelsch commented 8 years ago

Why are using defaults more important than ease of use?

I don't buy the argument that navigating to src/test-java instead of src/test/java is going to be easier so I'm -1 on this. It's also going to be a pain to cherry-pick changes to 2.x. Let's just stick with the defaults here.

We need to have less deep directories structures in general (read: less java packages)

Having less Java packages has nothing to do with this change so don't mix the two concerns.

rjernst commented 8 years ago

I don't buy the argument that navigating to src/test-java instead of src/test/java

I misspoke when I said lucene uses src/test-java, it just uses src/test. And saying that you "don't buy the argument" is like saying "I don't believe you are actually hurting". Yet it is a fact that we have developers who use the command line to navigate these files, and they are the ones who support this change. Yet those here who use IDEs for everything and do not even realize where or how deep these file hierarchies go are balking at a change that will literally not affect them? This is a ridiculous stance.

It's also going to be a pain to cherry-pick changes to 2.x

2.x is dead, or rather it will be very soon. I didn't say we need to do this tomorrow. But certainly doing it before 5.x will produce the least amount of pain.

clintongormley commented 8 years ago

I'm ambivalent about the change, although I agree with @ywelsch's comments about difficulties in cherry picking. One thing is for sure though, I'm not interested in these dramatic bullying comments and will -1 this change based on that alone.

Calm discussion, or close the issue.

javanna commented 8 years ago

Seems like the "too many java packages" problem is the one responsible for too much typing, which should not be confused with this issue. I agree that defaults don't have to be followed all the time, but I am not sure how much typing this change would help saving. Plus I don't see it as a debate between "everybody who uses the IDE" vs "everybody who uses the shell". Most of us use both IDE and shell and what I personally see being annoying on the shell is the number and the depth of java packages, which is a different problem.

rmuir commented 8 years ago

Its not "bullying" to clearly point out the codebase is crap, when it is.

It is simply a critical statement: its important we accept that the codebase is a big disorganized mess!

@clintongormley I am sorry if you want to live in a fantasyland where people only make positive comments, but that is not going to happen. Say hello to your reality check.

The problem is, the packages and the layout go hand in hand, and both contribute to making abnormally long file names:

The packaging starts with the abnormally long org.elasticsearch which could easy be org.elastic or something better, but it doesnt end there. It sometimes goes to hierarchies 6 or 7 deep, which is totally unnecessary. More than 99% of these hierarchies are undocumented.

That is a disorganized mess, and it makes super long file names which cause problems with e.g. copy-paste, overflowing terminals, all kinds of stuff. Tab completion or fancy IDEs does not fix the problem.

So yea, I'm sorry to have to be the one to say, that this is crap, but its crap. I will repeatedly emphasize this because it is the reality. It is not bullying. It is perfectly acceptable to voice a negative opinion: I will make sure you that recognize this. I reject your world of groupthink where everyone says that the situation is positive. My negative criticism will be amplified to 10x now that you have tried to push back against it.

Prepare yourself.

rjernst commented 8 years ago

Seems like the "too many java packages" problem is the one responsible for too much typing

Every bit of typing matters. I pointed out the "too many java packages" as an aside to reinforce this is an easy change that could help with navigating to every single class in the source

but I am not sure how much typing this change would help saving

It saves 5 characters, for every single class. Note in Robert's comments before, he does not use tab completion. But even with tab completion, this saves on an extra 2 characters (j-<TAB> once you are under src/test.

Plus I don't see it as a debate between "everybody who uses the IDE" vs "everybody who uses the shell". Most of us use both IDE and shell and what I personally see being annoying on the shell is the number and the depth of java packages, which is a different problem.

Sure, everybody uses the shell, but not for the same purpose. And it's not a different problem: depth of directory structure is the problem. Again, this proposal would help with every single class.

The cherry picking argument is valid. But we successfully moved the entire src for 2.x, and it did not halt all bugfixes back to 1.7.x. We also have very few patches going back to 2.x even now (and I would bet fewer once 5.0 is out).

I opened this discussion now because we are very close to branching for 5.0. A major version branching is really the only time to do large source moves (again, like we did in 2.0), so if we are going to make this change, it needs to be soon. I would ask that those of you who have -1'd to think if this is actually something that would hurt your development, or if it is worth blocking for those that do feel it would help.

clintongormley commented 8 years ago

@rmuir I'm in favour of the change. I'm not in favour of the aggressive style. It's completely unwarranted. This is a place for discussion, not for childish outbursts.

dadoonet commented 8 years ago

I think this thread has split in two separate concerns:

So I'm going to try to add my comments on both but separated.

Move to a simpler directory tree

I'm a strong believer in convention over configuration approach. My experience in the past is that a lot of developers have been educated using de-facto standard.

People now know that on maven if you want to build a project, you run mvn install and that's all. People are starting to know the same commands for Gradle as well as it's more and more adopted.

I'm not saying that this organization of dirs is good or not. That's not my point. The point is that 90% of java devs are using today src/main/java and src/test/java in their projects. I know some projects where they are using ant and they adopted the exact same organization as it will be easier then to migrate to Maven or Gradle.

Gradle also adopted those defaults.

That's why I'm -1 on this change until the convention changes.

Too many packages

I do agree about this. I wrote more than a year ago in a private issue that we should split the core in modules and make every module more isolated, testable on its own... So it would drastically reduce the number of packages per module. Ideally one module should contain a very limited number of packages.

Although I agree on this, I don't think that we should fix the lack of modularity we have today with a "false fix". Changing the default source dirs will may be mask for a short time the problem but won't fix that.

mikemccand commented 8 years ago

On Wed, Jul 27, 2016 at 8:06 AM, Clinton Gormley notifications@github.com wrote:

@rmuir https://github.com/rmuir I'm in favour of the change. I'm not in favour of the aggressive style. It's completely unwarranted. This is a place for discussion, not for childish outbursts.

+1 to keep the discussion/arguments technical.

And big +1 for this change.

Mike McCandless

imotov commented 8 years ago

My negative criticism will be amplified to 10x now that you have tried to push back against it.

Prepare yourself.

I think it's an important change. However, it might also have unforeseen negative side-effects, and because of it I would like to be able to discuss this change in a calm, respectful manner, without threats of retaliation for sharing one's opinion to make sure we didn't miss something. I think, it should be perfectly acceptable to voice a negative opinion and not been attacked for your choice of IDE in return.

I agree with @clintongormley, I simply cannot +1 any change that was discussed like this. I don't think any change should make it to elasticsearch after this sort of deliberation. So, I am -1 on it until we have a civilized discussion.

This is literally💩 communication. Please, let's make it better.

clintongormley commented 8 years ago

To pick this up again: I am in favour of this change because I think it is the first step in making our code base easier to navigate. Shorter is better. I don't agree with the "convention is better" argument - I find it easier to find stuff in Lucene, which doesn't follow the conventions, than I do in Elasticsearch. These changes make things more obvious, not less.

Yes, there is more that needs to happen: flattening out packages, reducing abstractions etc. But those things can be done bit by bit. This directory rename is a big change that affects the whole code base, and now is a good time to do it before we create the 5.0 branch.

I'm not worried about the cherry picking aspect - cherry picking should (theoretically anyway) work across renames. Also, the master branch has diverged so far from 2.4 that we seldom cherry pick directly anyway. The number of changes which go back to 2.4 is pretty limited these days.

So, there are advantages to making this change (perhaps they won't affect you directly) and the disadvantages seem pretty minor. Anybody who has a strong reason why we shouldn't do this?

nik9000 commented 8 years ago

I'd be in favor of getting cloture at this point and just having an up or down vote on the original issue. The arguments are fairly clear on both sides. Maybe we can give everyone until Monday night to vote? Any objections?

imotov commented 8 years ago

So, just to summarize:

Pros:

Cons:

Did I miss something?

mikemccand commented 8 years ago

Just to explain my +1 a bit:

I love shorter paths. Some of us type these paths many, many times per day. These paths also get folded into docs, URLs, READMEs, whatnot. Random devs clone ES and have to poke through them.

Less typing, easier to read, shorter URLs, etc. Choosing longer paths means you choose to add this unnecessary taxation to your life, much like if your shower and dressing routing consumes 2 hours each morning.

Shorter paths means less fat and conveys that we value being "lean", that we value being "different" (non-standard) because it is leaner. I think conveying that we choose to be lean and fast as a core part of our design ethic sends an important message.

And "it's a standard" is a poor defense imo: the world would never improve if all everyone ever did was follow the standards. Remember George Bernard Shaw's quote: "The reasonable person adapts themselves to the conditions that surround them, while the unreasonable person adapts surrounding conditions to them. Therefore, all progress depends on the unreasonable person."

I also dislike the "one person is being a 'bully' so now I veto this issue" approach: that just amplifies/echo-chambers the effect of one 'bully' and results in a fragile/big-company culture going forward. You should rather do the opposite (minimize the impact) by calling out the 'bullying', and then disregarding it and getting back on point (technical arguments) instead: that's what heals a community, makes it naturally resilient to 'bullying'.

Painful cherry-picking is a one-time cost, vs an expected loooong future benefits of less ongoing taxation. Saying this matters is much like arguing Lucene can't make a cool change since it would hurt backwards compatibility, and that used to hold back a great many good Lucene changes but guess who changed that broken culture in Lucene a few years back :)

ywelsch commented 8 years ago

@mikemccand Nobody argues here that breaking backwards compatibility is always bad or that we should strictly follow standards and not try to be lean. There is no need to get philosophical about this. I agree wholeheartedly that decisions should be made on technical arguments, not bullying / counter-bullying. I just don't think that "4 less characters to type" vs "non-standard directory layout / painful cherry-picking" is the right tradeoff to do in this case. I value the standard layout for its internal and external consistency. Regarding external consistency, the argument is simply that most Java projects use exactly this layout. Regarding internal consistency, look at

- src
  - java
  - resources
  - test
  - test-resources

vs

- src
  - main
    - java
    - resources
  - test
    - java
    - resources

The former feels inconsistent in the naming (why is java and test on same directory level? Does that mean that test is not only Java code)? The standard layout is clean and consistent.

rjernst commented 8 years ago

I just don't think that "4 less characters to type" vs "non-standard directory layout / painful cherry-picking" is the right tradeoff to do in this case.

How many characters reduction makes breaking from "standards" worthwhile? 7? 10? Minimum 25? The argument that 4 characters (which by the way, it is 5) less typing isn't meaningful is subjective. To some, it is meaningful. Why continue to cause pain for those developers so that a similar few will be "confused" about the layout for all of 4 seconds (subjective guess!) when navigating the source tree (which most people are going to do from the IDE anyways)?

nik9000 commented 8 years ago

To me the typing savings is super small. As is the cost in confusion for new developers.

The columns-on-the-screen savings is more compelling to me but not so compelling that I'd bother making the change personally. I just don't suffer much.

I don't think the cherry-pick cost is going to be super high because the files are just moved. If you have to use patch you'll need to jump around some extra, and, yeah, that is a bit of a pain but I think ok.

Honestly I voted to do it because it'll make a few folks very happy at fairly low cost. I think the cost is low for the reasons I typed above.

dakrone commented 8 years ago

Tangentially, if we wanted to remove/save more typing, we could buy es.co as a domain and then replace all the instances of org.elasticsearch//org/elasticsearch with co.es for package names :)

colings86 commented 8 years ago

What is the plan for eclipse with this change? At the moment we have core/main and core/test as separate projects to resolve the fact that Eclipse doesn't separate test and compile/runtime dependencies within a project (at least I think that was the reason). With all four folders being at the same level after this change whats the plan for recreating this separation in Eclipse?

rjernst commented 8 years ago

@colings86 It's workable. We already call the gradle file eclipse-build.gradle. We can have eclipse-java-build.gradle and eclipse-test-build.gradle in src, point the root at the java root for each, and then change the src dir for java to . and for resources to ../whatever.

colings86 commented 8 years ago

IF we decide to go ahead with this change can we please ensure that the above approach works in Eclipse before we merge the change (if we haven't already tested). Eclipse rarely handles relative paths that reference files out of the project well and its important that this change does not adversely affect Eclipse users as, while it is not used as much as IntelliJ in this community, it is still used by a number of this communities members. I am happy to help test that eclipse works.

rjernst commented 8 years ago

can we please ensure that the above approach works in Eclipse before we merge the change

Of course. :)

mikemccand commented 8 years ago

Honestly I voted to do it because it'll make a few folks very happy at fairly low cost.

Thanks @nik9000: that sort of comment (as opposed to e.g. immediate vetoing) is the anti-venom to becoming a big company.

It's really hard to make good changes once a team is too large, because there are too many chances to for people to point out all the reasons NOT to do something. "Design by committee" sets in: https://www.youtube.com/watch?v=Bjj_94ECUKU

So seeing you turn the logic around here is refreshing, thank you.

imotov commented 8 years ago

@mikemccand Sorry, could you clarify your comment a bit? Because it reads like voicing your approval for the side that bully is on is the only acceptable behavior and is the anti-venom to becoming a big company? Rising your concerns against this position is unacceptable because it leads to "Design by committee", rising your concerns about the tone of the discussion is also unacceptable - because you are supposed to just ignore it. What should we do when the bully is against the proposed change?

imotov commented 8 years ago

@dakrone maybe it wasn't a good idea to mention a domain by its name before we acquired it :) Oh well, we always have co/elastic to fall back to.

pickypg commented 8 years ago

I'm definitely not a fan of leaving behind the standard conventions just to save a few keystrokes that can be solved with local, .gitignored aliases.

I'm happy to have project names shortened or moved, but I see moving to src/java and src/test as far more confusing because of things like src/main/resources and src/test/resources (as @ywelsch pointed out, it's workable with dashes). As we add our own little source sets, this makes things more convoluted and makes it harder to get ramped up.

At the end of the day though, we really don't add that much in terms of subprojects and those that do get added are generally cloned from existing ones. So the only "cost" is on new hires learning our not-quite conventional system. As someone that's not new to the project, it won't be a big deal to me either way; I just don't prefer it. So let's try it?

mikemccand commented 8 years ago

Because it reads like voicing your approval for the side that bully is on is the only acceptable behavior

Sorry, that was not my intention: voicing approval for @nik9000's refreshing logic is not the same thing as approving bullying! They are not related.

Big entities (companies, communities) become ineffective over time because it is so much easier for people to say no to a change than yes, so as you add people to an entity, more and more "nos" will come out against changes and fewer and fewer big changes can happen. The company or community becomes a victim of its own success.

Whereas, if that community were made up of a bunch of @nik9000 s instead, this would not happen, because of his refreshing logic ("it will make some people very happy and there's not much harm, let's do it") in saying OK to a big change.

I also saw this happening back when @rjernst was working so hard to get the gradle build working: person after person kept raising their hand and giving reasons why we should not do this, putting up barriers that "so and so must be done before we swithc", and nobody was doing what Nik just did here (thank you @rjernst for persisting anyway!).

You should certainly say "no" to changes if you have your reasons, but I really love how and why Nik said yes here: that's rare and special.

imotov commented 8 years ago

@mikemccand thank you very much for the explanation. I now fully understand what you were saying. It makes perfect sense. Sorry, for bugging your earlier. I +1 the issue.

rjernst commented 8 years ago

Can anyone who cares about this please update their vote using thumbs up/thumbs down reaction on the original issue comment here? As @clintongormley mentioned, if we are going to do this, it should be soonish (next week?), so I would like to figure out if it is worth my time to test out the necessary eclipse changes.

nik9000 commented 8 years ago

on the original issue comment here

Do you mean on the description or on @dadoonet's first comment?

Just so we're super clear, a 👍 means you vote for reorganization and an 👎 means you vote against the reorganization.

jprante commented 8 years ago

Just my 2 cents: here is a good article https://blog.codinghorror.com/code-smells/ for defining some heuristics, in the hope you can reduce/collapse some classes and packages, esp. helpful might be the "Code Smells Between Classes" section.

rjernst commented 8 years ago

Ok, there are currently 8 votes for, and 4 votes against. I would like to make this change this coming weekend 8/20. I will confirm this week the eclipse changes will work. Please let me know if there is something I missed or something that requires holding off.

jasontedor commented 8 years ago

Please let me know if there is something I missed or something that requires holding off.

Leading with the most important item first, I have a very large changeset in flight (touching multiple hundreds of files) that is already tedious to maintain (I have to inspect all changes, even if no merge conflicts). No matter the outcome of this proposal, please do not make it at this time.

Yet it is a fact that we have developers who use the command line to navigate these files, and they are the ones who support this change.

I use the command line extensively and I do not support this change. I think my workflow will be negatively impacted by it many times daily. I have very strong muscle memory for src/main/java as every codebase that I spend regular time in save one (Lucene) follows the convention. I can tab-complete through src/main/java/ extremely rapidly except I always trip up in the Lucene codebase.

Lastly, it's very unclear to me whether a change this contentious should be resolved by vote, especially when (but not limited to) clear procedures for recording and tallying votes have not been established. As an example (among a few), should the :+1:s on @dadoonet's comment count as votes against (probably, but it's not been established).

mikemccand commented 8 years ago

Lastly, it's very unclear to me whether a change this contentious should be resolved by vote,

Actually, this is exactly the kind of change that needs to be / should be resolved by vote, or, by veto: if you feel so strongly that this change must not go through because of your muscle memory, you should upgrade your comments to a veto.

I do agree we should make the procedure very clear: ES cannot scale as a distributed team unless we have a clear process for making contentious decisions. Otherwise all we will be able to do is make uncontentious decisions.

In this case: put your vote on @rjernst original opening comment. People who care strongly enough about an issue should go and vote on it, otherwise "lazy consensus" applies.

rjernst commented 8 years ago

Given the current voting (about even) and the timeframe (5.0 is getting closer to release), I am closing this issue. It's not worthwhile to me to cause disruption right now (or potentially break the build so close to release with such a large move), but I do hope we can be open in the future to large changes based on their merit.