Background (Regex Parser)

godismyjudge95 commented 10 years ago

Problems:

For some reason the css parser won't parse any characters other than letters. For example, properties like: content-box and no-repeat break this code.
Default value for any longhand property is "initial"
When converting back from longhand to shorthand, the new changes are not captured and saved (even though I used the same code that redmunds used). Instead, the "initial" value is saved for all values.
There is an extra line at the end when converting from shorthand to longhand.

redmunds commented 10 years ago

For some reason the css parser won't parse any characters other than letters. For example, properties like: content-box and no-repeat break this code.

Notice how your JS code uses backgroundColor for the CSS background-color property? That's because "-" in JS is the minus operator (so not valid in an identifier), so they're converted to camelCase. Likewise, you may need to convert values with hyphens to camelCase (but that's just a guess).

redmunds commented 10 years ago

Default value for any longhand property is "initial"

Some of the default to "initial" and some seem to be "". Looks like you'll have to map that to actual default value.

redmunds commented 10 years ago

When converting back from longhand to shorthand, the new changes are not captured and saved (even though I used the same code that redmunds used). Instead, the "initial" value is saved for all values.

You need to use Chrome Developer Tools to walk through the code. Let me know if you don't know how to do that. For example, I started with background: blue, then added background-image: url("my.png"). Instead of being at decl.value.value[0].value (string) it's at decl.value.value[0].value.e.URL.value with type of quotes specified in decl.value.value[0].value.e.URL.quote, and it will be different if "url()" is not specified.

So, I think you'll need to walk through the tree passed back by the css parser on a case-by-case basis.

redmunds commented 10 years ago

There is an extra line at the end when converting from shorthand to longhand.

Take a look in InlineShorthandEditor._done and you'll see where I remove the \n from the end. It only works if \n is last char so put a breakpoint to see why that code doesn't work for you. Maybe there's an extra space. Maybe you're on Mac and it's \r.

godismyjudge95 commented 10 years ago

Ok, so I fixed all of the typos.

Notice how your JS code uses backgroundColor for the CSS background-color property? That's because "-" in JS is the minus operator (so not valid in an identifier), so they're converted to camelCase. Likewise, you may need to convert values with hyphens to camelCase (but that's just a guess).

I was not talking about my jquery code. Either way (with or without a hyphen) will work with that. What I was talking about is when the css that needs to be converted contains any other character than the regular letters the parser throws a fit and I need a completely different piece of code to read the values. (like you suggested later) Maybe that is what you are talking about. Either way, I do not know why it is doing that or what to do to solve that problem.

Some of the default to "initial" and some seem to be "". Looks like you'll have to map that to actual default value.

This is not really a problem, I just did not go through the hassle of putting in a default for each value yet, and I am not sure there is a universal default for all browsers.

Take a look in InlineShorthandEditor._done and you'll see where I remove the \n from the end. It only works if \n is last char so put a breakpoint to see why that code doesn't work for you. Maybe there's an extra space. Maybe you're on Mac and it's \r.

I will take a look there and see if that is why. I am not working on a Mac so I do not know what the breakpoint is on them.

You need to use Chrome Developer Tools to walk through the code. Let me know if you don't know how to do that. For example, I started with background: blue, then added background-image: url("my.png"). Instead of being at decl.value.value[0].value (string) it's at decl.value.value[0].value.e.URL.value with type of quotes specified in decl.value.value[0].value.e.URL.quote, and it will be different if "url()" is not specified.

So, I think you'll need to walk through the tree passed back by the css parser on a case-by-case basis.

Instead of having to figure out on a case by case basis of where each values are being stored, would it not be easier and more efficient to just capture whatever the values of the property that we are converting and pass them to jquery (like I am already doing) and have jquery (not JSLint) do all the work? Or do we absolutely need a css parser? If we just use jquery like I am currently, we could write one big function to handle all the work of expanding the code and another for compressing it.

Let me know what you think.

redmunds commented 10 years ago

I am not working on a Mac so I do not know what the breakpoint is on them.

By "breakpoint", I meant to set a breakpoint in the debugger so you could look at the actual data. Let me know if you don't know how to do that and I will show you -- it will make you life much easier.

Instead of having to figure out on a case by case basis of where each values are being stored...

When I said "case-by-case", I meant for each type. I figured out where strings and url() values are stored, but there probably aren't that many more types to figure out.

would it not be easier and more efficient to just capture whatever the values of the property that we are converting and pass them to jquery (like I am already doing) and have jquery (not JSLint) do all the work?

I still not convinced that the jQuery approach will work. For example, specify something like background-position: 20% 15em. I think jQuery will try to convert those value to raw numbers, so you'll lose the units.

For now, if you want the text then do this:

    var text = ShorthandManager.unparseDeclarationList(declList);

We can switch to just passing text if you like, but I think it will need to be parsed at some point.

godismyjudge95 commented 10 years ago

By "breakpoint", I meant to set a breakpoint in the debugger so you could look at the actual data. Let me know if you don't know how to do that and I will show you -- it will make you life much easier.

Oh no I know what that is. I was talking about the \r thing. I do not know if the \r thing is the issue as I am not on a Mac.

When I said "case-by-case", I meant for each type. I figured out where strings and url() values are stored, but there probably aren't that many more types to figure out.

But other properties that we want to implement would have to be troubleshooted the same way. Anytime the parser creates a new format for storing the info.

I still not convinced that the jQuery approach will work. For example, specify something like background-position: 20% 15em. I think jQuery will try to convert those value to raw numbers, so you'll lose the units.

As far as I can tell jQuery does not change the form of the numbers (but it does change the word center to 50% 50%). I will start working on a prototype to show you what I am talking about. This new one should work for most properties and would require little to no configuration for new ones.

godismyjudge95 commented 10 years ago

For now, if you want the text then do this:

var text = ShorthandManager.unparseDeclarationList(declList);

Also, for some reason this code does not work when I put it in the convert to longhand function it does not work. I tried substitution the variable too, but it does not work.

redmunds commented 10 years ago

I am for whatever works best for everyone. It's easy to convert params from parsed format to strings.

If we decide to stick with the css parser, then another option is to create some functions such as getPropName and getValue() so we don't have to use value.value[0].value (and even worse) everywhere.

godismyjudge95 commented 10 years ago

Ok, so here is the update. Ideally I would not have to reverse engineer the parser and retrieve the values the way I did, but this was just for a proof of concept. The code could definitely be a lot cleaner and more simple. Because I wanted the values just held in a string and not object form I had to add in some code that broke support for the trbl provider. Go ahead and try it out. Everything works (as far as I can tell) except for the fact that the background position values (e.g. left center) are converted to percentages (e.g. 0% 50%). Oh, and the when specifying a background image the parser messes up the code, but if we remove the parser altogether this will be fixed. Please let me know what you think, but I believe going the jquery route will make expansion easier and the code less complicated and more efficient.

Thank you for your help with this.

redmunds commented 10 years ago

So, it looks like you do still want to parse the raw text, but put it in a simpler format:

    declList = code.split(": ")[1].split(";")[0];

This should be done with a regex.

I did some testing and could only get a few cases to work. Below are the problems I saw. We should be sure that these can be fixed before changing the parsing.

When I set background-image: url("my.png"); the value was dropped. Note that I was careful to specify the full path to an image that actually exists on disk in case that was the problem, but it could be that image is not ofund due to base url being different from current page.
As you noted, setting background-position: center center; converts value to 50% 50%. Even worse, shorthand editor will not re-open with saved value!
I set background-size: contain; and value was dropped.
background-repeat and background-attachment were only properties that worked for me
I tested background-origin and background-clip together since they take the same values. When using inherit, editor will not re-open. If I only set 1 of the 2 values, then re-opening editor assigns the 1 value to both.
background-color only seems to work with named values such as blue. I got it to work once with a hex value (e.g. #ce9 or cc3399) and it was converted to rgb() and editor would-not re-open, then it wouldn't work again with hex values. I could not get it work with rgb() or hsl() formats. Note to test different color formats, put cursor in color and hit Ctrl-E to get inline color editor.

Let me know if you want to continue down this path and I will fix the parsing code in this branch.

redmunds commented 10 years ago

After playing around with this provider, I realize that it would be very helpful to have code hints inside the shorthand editor, so I opened issue #9. I think this should be easy to implement.

godismyjudge95 commented 10 years ago

I definitely agree that code hints will definitely help. Also, most of the problems, I believe, right now are caused by the parser. So, I think if we dismantle that most of the url and color problems will be fixed. I could be wrong but most of the disappearing values happen when the text is parsed into longhand and then at the end when it is unparsed into shorthand. It would be great if you could help me remove the parsing code on this branch. That way we can see for sure how this might work. If it ends up being more complicated, we can just start over from the main branch and use the parser.

Also, about the code hints, how easy is that to implement?

redmunds commented 10 years ago

I submitted pull request #10 to simplify parsing.

The code hinting problem needs to be fixed in Brackets code.

godismyjudge95 commented 10 years ago

Ok, so the code hinting is unrelated to this extension? Also, thank you so much for your help with changing the code from parsing to regex. I think I will now be able to figure out the rest of the problems pretty easily.

godismyjudge95 commented 10 years ago

Um... so I was looking at your code for this, and I am not quite sure why you went down the route that you did. The code seems a lot more complicated than what I was doing. Plus the conversion from shorthand to longhand seems to be messed up. Maybe you could explain what you changed so that I can better understand how to fix bugs as I find them?

Thanks for you help.

redmunds commented 10 years ago

@LeinardoSmith Which exact code?

FYI, you can switch to "Files Changed" tab and insert comments at the exact line of code that you want to comment on.

Otherwise, I'm on the freenode #brackets IRC channel if you want to chat.

redmunds commented 10 years ago

I guess I missed you. BTW, I'm in Pacific Time zone.

I replaced some of your string handling code with array manipulations that I think are simpler and more reliable. I can explain them to you.

I also added the convertProps() function. It takes a declList (array of prop/val pairs) and an array of "to" props, and returns an array of vals (that map to the array of "to" props). This converts either way: shorthand ==> longhand, or longhand ==> shorthand. It's really the same code -- just the "from" and "to" params are reversed.

Other than that, I'm not sure what you think is more complicated.

godismyjudge95 commented 10 years ago

Yeah, I think I missed you now. I am in Eastern Time zone. Thanks for the explanation of what you did. I was just having trouble understanding what the code was doing. I am still very much in a learning phase with javascript. I cannot seem to get values that have hyphens in them to work. For example, no-repeat seems to make all of the values blank. I think it might have something to do with how we are passing the values to jquery in the $test.css command.

Let me know if there is anything else I can help with. I am going to continue looking for issues.

redmunds commented 10 years ago

@LeinardoSmith I saw that you tried to reach me on IRC the other day -- did you resolve your issue?

godismyjudge95 commented 10 years ago

@redmunds yeah I must have missed you, but I have not figured out the solution. I was wondering how you would suggest to distinguish between the values of the bg position and bg size, as they are measured in the same units. Here is what I have so far: http://regex101.com/r/sN7fG7

I have regex for most of the other values. http://regex101.com/r/fZ9sE8

http://regex101.com/r/sN7fG7

http://regex101.com/r/eB1kM1

http://regex101.com/r/rB8xQ8

http://regex101.com/r/rF2zW7

Let me know what you think.

redmunds commented 10 years ago

As I started to think about this, I realized that the shorthand extension answers the question! Start with any expanded background and set it to:

background-image: initial;
background-position: 1px 2px;
background-size: 3px 4px;
background-repeat: initial;
background-attachment: initial;
background-origin: initial;
background-clip: initial;
background-color: initial;

Click Done and you have:

background: 1px 2px / 3px 4px;

The values are separated by a slash so there is no guessing involved.

In the spec it's designated as <position> [ / <bg-size> ]?. This means that first numbers are position. If position is specified, then size can be specified following a slash.

godismyjudge95 commented 10 years ago

Ok, good news! I think I finally have a few regex's that will either find a working value or return null. I have tested these with every possible case that I can think of and used the W3C spec to create them, so I think I have my bases covered. Let me know what you think and if you have any suggestions. Otherwise I think we can go ahead and put these into their appropriate place.

image = string.match(/(url\(.*\))/)[1];
position = string.match(/\s*((?:\s*(?:left|right|top|bottom|center|\d+(?:\%|in|cm|mm|em|rem|ex|pt|pc|px))){1,4})/i)[1];
size = string.match(/\/\s*((?:\s*(?:left|right|top|bottom|center|\d+(?:\%|in|cm|mm|em|rem|ex|pt|pc|px))){1,2})/i)[1];
repeat = string.match(/\s*((?:\s*(?:repeat-x|repeat-y|repeat|space|round|no-repeat)){1,2})(?!\:)/i)[1];
attachment = string.match(/\s*((?:\s*(?:fixed|local|scroll)){1,2})/i)[1];
origin = string.match(/\s*(?:\s*(border-box|padding-box|content-box)){1}/i)[1];
clip = string.match(/\s*(?:\s*(border-box|padding-box|content-box)){1,2}/i)[1];
color = string.match(/\s*(rgba?\(.*\)|hsla?\(.*\)|\#[a-f0-9]{3}(?:[a-f0-9]{3})?|inherit|transparent|currentColor|aliceblue|antiquewhite|aqua|aquamarine|azure|beige|bisque|black|blanchedalmond|blue|blueviolet|brown|burlywood|cadetblue|chartreuse|chocolate|coral|cornflowerblue|cornsilk|crimson|cyan|darkblue|darkcyan|darkgoldenrod|darkgray|darkgreen|darkgrey|darkkhaki|darkmagenta|darkolivegreen|darkorange|darkorchid|darkred|darksalmon|darkseagreen|darkslateblue|darkslategray|darkslategrey|darkturquoise|darkviolet|deeppink|deepskyblue|dimgray|dimgrey|dodgerblue|firebrick|floralwhite|forestgreen|fuchsia|gainsboro|ghostwhite|gold|goldenrod|gray|green|greenyellow|grey|honeydew|hotpink|indianred|indigo|ivory|khaki|lavender|lavenderblush|lawngreen|lemonchiffon|lightblue|lightcoral|lightcyan|lightgoldenrodyellow|lightgray|lightgreen|lightgrey|lightpink|lightsalmon|lightseagreen|lightskyblue|lightslategray|lightslategrey|lightsteelblue|lightyellow|lime|limegreen|linen|magenta|maroon|mediumaquamarine|mediumblue|mediumorchid|mediumpurple|mediumseagreen|mediumslateblue|mediumspringgreen|mediumturquoise|mediumvioletred|midnightblue|mintcream|mistyrose|moccasin|navajowhite|navy|oldlace|olive|olivedrab|orange|orangered|orchid|palegoldenrod|palegreen|paleturquoise|palevioletred|papayawhip|peachpuff|peru|pink|plum|powderblue|purple|red|rosybrown|royalblue|saddlebrown|salmon|sandybrown|seagreen|seashell|sienna|silver|skyblue|slateblue|slategray|slategrey|snow|springgreen|steelblue|tan|teal|thistle|tomato|turquoise|violet|wheat|white|whitesmoke|yellow|yellowgreen)/i)[1];

redmunds commented 10 years ago

These look pretty good.

One problem with the /(url$.*$)/ regex is the .* defaults to finding the maximum length string, so it won't stop when it hits the first ). Use .*? to specify using the minimum length string: /(url$.*?$)/. This change also improves performance on long lines of text. I also see this used in some of the other regexes.

I don't think that left|right|top|bottom|center are valid for size, only position.

What's the difference between the regex's for clip and box? I think the valid values are the same.

You can use the Brackets ColorUtils.COLOR_REGEX regex.

godismyjudge95 commented 10 years ago

Right on the URL part and the BG size. As for the clip and box. I needed to differentiate between the two when both are present, so they are slightly different. Also I am not sure how you would implement the color thing with brackets as this is regex.

Otherwise I think we are ready to implement it in the code.

Thanks for the help.

redmunds commented 10 years ago

That is also a regex. They're basically the same, so no need to duplicate it: https://github.com/adobe/brackets/blob/master/src/utils/ColorUtils.js#L40

godismyjudge95 commented 10 years ago

Isn't that more complicated than what I have here? Plus it does not seem to cover all of the css3 supported colors. I copied the colors in my regex directly from the W3C's website.

redmunds commented 10 years ago

That's OK. I just wanted to make you aware of it.

godismyjudge95 commented 10 years ago

So where should I put these in? Should they go in the provider or the parse function or somewhere else?

redmunds commented 10 years ago

I would first get it working in the provider, then you can move anything that might be useful to some other shorthand property to ShorthandManager.

godismyjudge95 commented 10 years ago

OK I will work on that this weekend.

godismyjudge95 commented 10 years ago

I went ahead and implemented it, but it is a little sloppy. Also instead of using the jquery I figured this does basically the same thing except it does not panic when bad values are passed. I think this may be the best way to go completely. That would fix the other problems with the URLs and stuff. I have not implemented this however in the longhand to shorthand conversion yet. That is still using jquery.

redmunds commented 10 years ago

OK, I looked at your code. I was thinking of a slightly different approach that would make the regex's a little simpler.

Use a loop:

Test each regex (nested loop) until you get a match on the first value in the list. To do this, add ^ to the beginning of each regex so it only matches beginning of the line. This would allow you to use same regex for both clip and box (that I asked about earlier), but you'll need to keep track of the order they are found.
If no match, exit loop.
Then use match[0].length to remove that value from list
Go back to top of loop

I guess the downside of my approach is that it would have to deal with (or fail on) any errant tokens, where your approach ignores them.

One kind of "errant token" to deal with are comments. It would be relatively easy to recognize and keep them, but maintaining the position of comment in value list would be a pain.

redmunds commented 10 years ago

I played around with it and it seems to work pretty good (except for the known problems).

One other case is with multiple backgrounds:

    background: no-repeat rgb(204, 204, 204), white repeat-x content-box clip, blue repeat-y content-box;

This is a non-sensical value that I made up to illustrate the problem, but the way the code is, it always edits the first one. It doesn't make sense to try to expand and load them all, but it should only edit the one where the cursor is. But, you can list this as a known limitation for now.

Another case that I don't think the code handles is if values are on multiple lines:

    background: no-repeat rgb(204, 204, 204),
                white repeat-x content-box clip,
                blue repeat-y content-box;

godismyjudge95 commented 10 years ago

Ok... not sure how the loop thing would work, but I think we definitely do need to ignore values that are not up to w3c code. I think that we could get the code to work with comma separated multiple backgrounds. Also, if we change over the longhand to shorthand code to regex, I think most of the known problems should be fixed.

godismyjudge95 commented 10 years ago

Ok, so I did not work on the loop code yet, but I did switch over the longhand to shorthand conversion to regex. I know it can be simpler but I just wanted to get it up and running. Most of the known issues should now be fixed. All that is left is to simplify the code. I think even all of the problems you mentioned should now be fixed! :D If this works out well, I think the other providers will be much simpler to add. Let me know what you think.

redmunds commented 10 years ago

I'll be out for the next week, but I'll take a look when I get back.

redmunds commented 10 years ago

I played around with it and it seems to work pretty good, although I honestly can't remember all of the previous problems I found.

In addition to this regex problem I mentioned, I also notice that the extension doesn't work at all in Sprint 38 (which hasn't been released yet), which updated CodeMirror to v4, so I can research that after Sprint 38 is released.

godismyjudge95 commented 10 years ago

Ok... so did you go ahead and change the .* to .*? I am trying out as many different situations that I can and so far that is the only thing that does not work. Hopefully we will not have to do too much to get it working again in Sprint 38. By the way, I am also working a font property provider since that is similar to this one.

redmunds commented 10 years ago

did you go ahead and change the .* to .*?

I changed it locally to verify that fixed the problem, but I did not submit a pull request. I figured it was a simple enough fix for you to make.

godismyjudge95 commented 10 years ago

Yeah that is fine.

godismyjudge95 commented 10 years ago

Just finished adding support for the font property. Need to start bug fixing, although it should be pretty stable already.

redmunds commented 10 years ago

That's cool, but I, personally, would wait until the background provider code is working and merge this branch into master. Then added the font provider in a separate branch/pull request. This is helpful when someone is looking for (or asks for) for an example of adding a provider, then each pull request is an atomic example.

godismyjudge95 commented 10 years ago

Oh ok... I will move the changes to another branch and add another pull request.

godismyjudge95 commented 10 years ago

So can we go ahead and merge this with the main branch? I have not come across any more problems. Plus the font support will not work without this being merged upstream (since this is the branch we changed the parser to regex). There is probably still a lot of unused code, because we were trying out so many things. I, however, do not know what all is being used by the trbl provider. Let me know what you think needs to be done next.

redmunds commented 10 years ago

I did a quick code review and found 1 function that can be removed (unless you think you may use it for something).

I re-tested the TRBL provider and it looks good.

godismyjudge95 commented 10 years ago

Sounds fine to me. Let me know if you find anything else.

redmunds commented 10 years ago

@LeinardoSmith Are you waiting for me on something to merge this?

godismyjudge95 commented 10 years ago

@redmunds Not that I know of... is this ready to merge?

redmunds commented 10 years ago

I believe that the convertProps() function is no longer used. Please verify, but if so, you can remove it. Otherwise, merge it!

godismyjudge95 / shorthand

Background (Regex Parser) #8