Closed marcoarment closed 10 years ago
It could also provide a potential path for a library to take that wants to use hack but still support php. I'd like to make my libraries that I maintain use hack for static typing, but I still need them available as composer packages for non-Hack projects. I'd rather not maintain multiple versions of the library so being able to convert a Hack file back to "standard" PHP would be super useful.
When we discussed giving a tool to strip types, we realized that Hack is more than just a type system. At the type level, it would be trivial to erase the types and run the code on pure-PHP engines. Dealing with everything else is much harder (xhp, containers, lambda, etc.), at which point it just makes more sense to use HHVM. For now, based on our understanding of the pros and cons, we decided not to offer such a tool.
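To make the "trivial at the type level" part concrete, here is a minimal sketch of type erasure, assuming a deliberately naive regex approach. The names `strip_types`, `PARAM_TYPE` and `RETURN_TYPE` are hypothetical and not part of HHVM or any existing tool. It only handles simple parameter and return type hints on lines containing `function`; nested generics, XHP, collections and lambdas are out of its reach, which is the harder part described above.

```python
import re

# Parameter type hint directly before "$name", e.g. "(int $a" or ", ?Foo $b".
PARAM_TYPE = re.compile(r'([(,]\s*)\??[A-Za-z_][\w\\]*(?:<[^>$]*>)?\s+(?=\$)')
# Return type hint after the closing parenthesis, e.g. "): int {".
RETURN_TYPE = re.compile(r'\)\s*:\s*\??[A-Za-z_][\w\\]*(?:<[^>{]*>)?(?=\s*[{;])')


def strip_types(hack_source: str) -> str:
    """Naively erase simple Hack type hints (hypothetical helper, sketch only)."""
    out = []
    for line in hack_source.splitlines():
        if line.strip() == "<?hh":
            # Swap the Hack open tag for the PHP one.
            line = "<?php"
        elif re.search(r'\bfunction\b', line):
            # Only touch function signatures, to avoid mangling ternaries etc.
            line = PARAM_TYPE.sub(r'\1', line)
            line = RETURN_TYPE.sub(')', line)
        out.append(line)
    return "\n".join(out)


HACK_SOURCE = """<?hh
function add(int $a, int $b): int {
  return $a + $b;
}
"""

print(strip_types(HACK_SOURCE))
```

A regex pass like this breaks down quickly on real code, which is why the thread keeps coming back to wanting a proper parse tree rather than tokens or pattern matching.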
@alokmenghrajani Fair enough. I think you may be underestimating the demand for type annotations among people who won't otherwise use the other Hack features enough to be worth tying themselves to a very young language and VM.
PHP is a very old, proven, conservative, familiar, ubiquitous choice — Hack and HHVM are none of those. Because of those qualities, PHP appeals to many people for whom full-blown Hack will be considered merely an interesting experiment for a while, not something suitable for deploying for important apps. But deploying important apps is exactly where type annotations are in huge demand.
Since writing such a tool (for type-annotations-only conversion) isn't a big undertaking, and would still benefit a lot of people (I suspect), I wouldn't be surprised to see third-party or in-house options crop up for this. (I'd almost certainly write one, and I use HHVM already — just not Hack, and I'd like to keep my options open since they're both so young. Today, while I use HHVM, I can switch over to PHP in minutes if necessary. If I start writing anything in Hack, I can't.)
In other words, while "Hack is more than just a type system", as you said, I think realistically, there are probably a lot of people who really just want its type system.
If tools like this end up happening anyway, I think the community and developers would be better served having an officially supported parse tree, at least, being output by HHVM for this purpose, rather than trying to reimplement it (and keep up with changes) poorly.
If you simply implement --parse with structured output (JSON?), it would be an easy way to empower and improve experiments like this without needing to have any further or ongoing involvement.
Agreed. Right now, I only see hack as being useful in the last mile. Underlying libraries can't take advantage of hack without duplicating logic in hack that already exists in PHP.
A build-time utility to convert Hack to PHP (or the other way around? perhaps using docblock annotations?) would do this and as @marcoarment said, this utility or the parsing behavior to make it easily possible, would go a long way to making this a non-issue.
Yeah, I think the case for a translation tool is even clearer for library and framework authors.
I completely understand that deployed frameworks can't just wholesale convert over to Hack and require HHVM -- they'd lose a significant fraction of their users overnight if they did this :) However, all is not totally lost, since a Hack project using, say, Symfony (which is and will presumably remain vanilla PHP for a while) will work just fine. Hack inter-operates seamlessly with PHP -- you just are responsible for your own typechecking at the border. So it does work, at least, right now.
But of course that isn't a great solution -- we can and should do better, ideally without requiring the entire world to convert over to Hack. Strong typechecking at said border is often important and while it technically "works" it is pretty far from ideal. But I think that a conversion tool as described in this thread isn't the right way to go about it.
hhi files, just like we do for the standard library. This is doable right now if you really want, but I wouldn't suggest it either. It's way too easy for the hhi files to get out of sync with reality, since there is currently no mechanism to ensure that they stay in sync. Hack doesn't know anything about PHP code, so it has no way to check the hhi consistency with the actual implementation.
So that's where we're at right now. But as I said, I think this is worth improving. While a type erasure tool isn't the way to go, I think there is a possible workable approach in the last option, with the hhi files. I think we could build a way to check hhi consistency against upstream PHP code, allowing contributors to still contribute vanilla PHP on PHP-5, and a few folks who care about Hack to easily, mechanically check the consistency of the hhi files before release. We might even be able to provide a way to bootstrap the initial hhi files to help folks get started.
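As a rough illustration of what a mechanical hhi consistency check could look like, here is a minimal sketch. Everything in it is an assumption for illustration: the helper names `signatures` and `check_hhi` are hypothetical, the regexes only understand top-level `function name(...)` declarations without parentheses in default values, and it compares parameter names only, not types. It is not how Hack does or would do this.

```python
import re

# Matches "function name(params)"; breaks on defaults containing ")" (sketch only).
FUNC = re.compile(r'function\s+(\w+)\s*\(([^)]*)\)')
PARAM_NAME = re.compile(r'\$\w+')


def signatures(source: str) -> dict:
    """Map each function name to its ordered list of parameter names."""
    return {m.group(1): PARAM_NAME.findall(m.group(2))
            for m in FUNC.finditer(source)}


def check_hhi(hhi_source: str, php_source: str) -> list:
    """Report functions whose hhi declaration disagrees with the PHP source."""
    declared = signatures(hhi_source)
    actual = signatures(php_source)
    problems = []
    for name, params in sorted(declared.items()):
        if name not in actual:
            problems.append(f"{name}: declared in hhi but missing from PHP")
        elif params != actual[name]:
            problems.append(
                f"{name}: hhi params {params} != PHP params {actual[name]}")
    for name in sorted(set(actual) - set(declared)):
        problems.append(f"{name}: defined in PHP but missing from hhi")
    return problems


PHP_SOURCE = """<?php
function greet($name, $shout) { return $shout ? strtoupper($name) : $name; }
function farewell($name) { return "bye " . $name; }
"""

HHI_SOURCE = """<?hh
function greet(string $name): string;
function farewell(string $name): string;
"""

for problem in check_hhi(HHI_SOURCE, PHP_SOURCE):
    print(problem)
```

Here the hhi declaration of `greet` has fallen behind the PHP implementation, which gained a second parameter. That is the kind of drift a pre-release check would be meant to catch.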
None of this is set in stone -- we still need to flesh out the details. Maybe it won't end up working and we'll do something else. But we are thinking about this!
And of course, if you think any part of my analysis above is incorrect, please let me know. I did do a considerable amount of research on this in months past, but parts of it are also based on intuition, so please feel free to correct that intuition if it is incorrect!
I agree with @marcoarment.
This is the same problem as with the Dart language. It provides superb tools to create type-safe client-side code, but since its dart2js compiler produces unreadable JavaScript, it's all or nothing if you want to start using it.
If a Hack-to-PHP transpiler is created, it should prioritize making as few changes as possible. It would work as a fallback in case Hack is no longer developed or becomes unusable in the future.
I think our difference in views on this comes down to our different priorities and environments.
You, the Hack creators and contributors, are fully invested in the world of Hack. You wanted to extend a new language on top of PHP, you achieved that, and you're fully invested in writing code exclusively in Hack that runs exclusively in HHVM. And Hack isn't very new to you.
To the PHP community, Hack is a brand new thing. For the first time, the HHVM project — which has benefited us tremendously as an alternative to the PHP interpreter — has added some very useful new language features, but without an escape hatch. If we adopt any of them, there's no going back: we're locked into HHVM forever, everywhere our code needs to run.
In short, Hack removes you from the PHP community and forces you — and any developers adopting Hack — to build your own ecosystem, starting from nearly zero. You're no longer writing a language that can be used and deployed easily everywhere (and often already is deployed).
A tool that can strip type annotations to convert rudimentary Hack usage to PHP would allow people interested in Hack to keep one foot in both camps. It lets us use Hack the same way we've been using HHVM: as a tool to make PHP better. It's a smaller ask, and I bet you'll get a lot more people using Hack if you give them this huge increase in practicality and peace-of-mind.
Without such a tool, you're asking everyone to leave the comfort and practicality of PHP and devote themselves 100% to a brand new language and platform. Honestly, if I'm going to bear those costs, I'd have to consider whether it's more worthwhile to migrate my code and skills to something gaining more mainstream steam, like Node or Go, or a long-established competitor with a huge community and robust tools like Python or Ruby.
People will use Hack outside of Facebook regardless, but I bet it'll get much larger support, and much more quickly, by letting PHP programmers safely dip their toe in with an easy way out.
I see a great reason to support compiling to native PHP from Hack: client requirements. If you are writing code for your own consumption, using Hack or PHP is an internal business choice. However, I have clients that have software restrictions. For them to change their supported languages to include a new language can take literally years, if it were ever to happen. (Think banks, publicly traded companies enforcing self-imposed SOX regulations, or companies that resist change). Having a way to make Hack code portable to native PHP code would be fantastic, even if there were caveats to making it work properly.
For this reason alone, Hack is a non-option for me. Or for my clients. I can play with Hack all I want on my personal time, but until something like this is put in place the odds of me ever being able to make a living using Hack are low without changing my industry focus.
Hey everyone. Sorry we haven't been able to respond to this yet, we're all really busy getting ready for the Hack developer day, and this issue deserves more careful consideration than we've had time to do yet. But we are listening and have been reading this issue since the beginning! Sorry, and hope we can figure something out soon.
Sounds great - thanks @jwatzman
Sorry it's taken so long to write up a response to this. Things have been happening in the background, though.
For the first time, the HHVM project [...] has added some very useful new language features, but without an escape hatch. If we adopt any of them, there's no going back: we're locked into HHVM forever, everywhere our code needs to run. In short, Hack removes you from the PHP community [...]
This is a concern that we totally understand. Dropping all of the existing PHP ecosystem is something that's not viable for a lot of folks, for good reason. Improving interoperability without a wholesale switch is something we want to do. We just don't think a backwards conversion tool is the right approach, for the reasons I outlined above (as well as the pure technical reasons I just outlined at https://github.com/HackConv/HackConv/issues/1#issuecomment-41348540).
But what is the right approach? One small step is to start letting the typechecker check "Hack-aware PHP" code: https://github.com/facebook/hhvm/commit/771e8af71200cd763dcbcf4dbc01205944669977. This came out of a discussion with the Composer, Symfony, and Heroku projects. The idea is that projects who don't want to wholesale convert to Hack but who do want to be "Hack-aware" or "Hack-friendly" can start letting the typechecker see into their code. They still keep writing vanilla PHP, but we extract what information we can from their type signatures, and yell if they do anything that is not valid Hack syntax. So they end up using a subset of both languages, getting a little bit of the safety of Hack's type system while sticking with 100% PHP.
That helps Hack project end users a lot (their interface with the library is now typed), but doesn't help the library author themselves too much (only basic decl-mode checks are being done). How do we help the library authors even more, to give them the power of Hack and its type system without making them convert, and without a janky conversion tool that no one wants to use anyways? We have some ideas here, and this is definitely a space to explore. If you guys have ideas, I'd love to hear them. But as I said, I'm pretty convinced a Hack => PHP conversion tool isn't it.
Does this address your concerns -- or at least convince you that we properly understand them and are thinking hard about how to address them? We don't have a good answer right now, but we definitely realize that we need one in order to help bring the goodness of Hack to more folks :)
If you guys have ideas, I'd love to hear them.
What about a PSR-5-esque type annotation parser for this <?php //decl mode? This way code that uses things like @param string $foo could still parse as PHP but when run through Hack get type support as well.
What you have looks promising though - this could just be a way to extend it to be more so.
I don't know enough about Hack (or creating languages, for that matter) to suggest anything more, but I think you're on the right track, and I'm thankful that you're considering these factors.
Also, I like @nubs' idea of being able to parse types from doc comments. I know it's a little gross to compiler/semantic purists, but it would be immensely practical.
And isn't "gross to purists but immensely practical" the entire story of PHP?
I think this thread needs a cat analogy. All kittens are cats but not all cats are kittens. PHP is a cat and Hack is a kitten and a cat. You should not try to turn a cat into a kitten, ever.
If you have a reason to write in Hack because it serves a better business process then that code is Hack. Hack can read PHP but it should not be asked to create it. I will not write a library in Hack which would just as well serve as PHP. But if my library leveraged the features of Hack beyond blanket type casting then it would be a Hack library forever. If I made a mistake and created a Hack library which could have been written in PHP and is needed in PHP then it should be rewritten but not by --parse
|\ _,,,--,,_
/,`.-'`' ._ \-;;,_
|,4- ) )_ .;.( `'-'
'---''(_/._)-'(_\_)
What about a PSR-5-esque type annotation parser for this <?php //decl mode? This way code that uses things like @param string $foo could still parse as PHP but when run through Hack get type support as well.
We considered this too and have discarded it for the time being. The problem is that docblock annotations aren't actually enforced anywhere by the runtime. This quickly leads to lots of your annotations being lies, even if the typechecker is verifying them.
Facebook actually ran into this in a pretty bad way: until very recently (basically right before the Hack open source launch), HHVM didn't enforce return type annotations at all. The Hack typechecker could still enforce their consistency with each other, but their consistency with reality wasn't actually enforced anywhere. When we added return type checking to HHVM, we found out that we couldn't actually turn it on in our codebase -- too many of the return types were completely wrong! Not just one or two, and not just "slightly wrong" -- but lots and lots of annotations, and many of them things that, upon simple inspection, were so completely wrong it was unclear why they had been written in the first place. The typechecker wasn't catching them because, due to the gradually typed nature of partial mode, it's easy to lose type information at which point we fall back to "assume the programmer knows what they are doing". So they were self-consistent, just not consistent with reality. We just last week got this under control enough to turn on logging for it, not even hard failures yet. It's a painful cleanup, and I'm happy that we launched HHVM 3.0 with hard failures on return types enabled by default so no one else has to go through it.
So if we parse out of docblocks, we run the risk of inflicting this exact issue onto everyone, since there is nothing grounding those annotations in reality. It's really easy for them to diverge, even if you have the Hack typechecker helping you out (which, since this functionality doesn't exist, no one writing vanilla PHP does right now). At best, they'll discover some lying docblock annotations in a few years when they try to reify their docblock annotations as part of a full-blown Hack conversion. At worst, they'll get frustrated at the typechecker when it keeps complaining about non-bugs, or failing to catch real ones, due to the ungrounded docblock annotations. Again, we've been there, and it's a pretty horrible place to be and to get yourself out of; we don't want anyone else to have to go through it as much as we can help them avoid it.
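To illustrate how a docblock can drift from what the runtime actually enforces, here is a minimal sketch that compares `@param` annotations against the native type hints in the signature below them. The function name `docblock_mismatches` and both regexes are hypothetical, and the sketch assumes a single function preceded by a docblock. Note that it can only flag a lie where a runtime-enforced hint exists to compare against; a parameter with no native hint has nothing grounding its docblock at all, which is the problem described above.

```python
import re

# "@param <type> $name" lines inside a docblock.
DOC_PARAM = re.compile(r'@param\s+(\S+)\s+(\$\w+)')
# An optional native type hint followed by "$name" in a parameter list.
SIG_PARAM = re.compile(r'(?:([A-Za-z_][\w\\]*)\s+)?(\$\w+)')


def docblock_mismatches(php_function: str) -> list:
    """Return (param, documented_type, enforced_hint) for each disagreement.

    Assumes the input is one docblock followed by one function (sketch only).
    """
    doc, _, rest = php_function.partition('*/')
    doc_types = {name: typ for typ, name in DOC_PARAM.findall(doc)}
    params = re.search(r'function\s+\w+\s*\(([^)]*)\)', rest).group(1)
    mismatches = []
    for hint, name in SIG_PARAM.findall(params):
        documented = doc_types.get(name)
        # Only parameters with a native hint can be checked against reality.
        if hint and documented and documented != hint:
            mismatches.append((name, documented, hint))
    return mismatches


PHP_FUNCTION = """/**
 * @param string $ids
 * @param Logger $log
 * @param int $limit
 */
function load(array $ids, Logger $log, $limit) {}
"""

print(docblock_mismatches(PHP_FUNCTION))
```

In this example `$ids` is documented as `string` while the signature enforces `array`, so it is reported. `$limit` has no native hint, so its `int` annotation cannot be confirmed or refuted by anything the runtime does.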
This is why I like just reading the existing annotations in <?php // decl -- those are enforced by the runtime, in the same way, whether your runtime is HHVM or not. So there's no chance for divergence. And if PHP decides to pick up some of our extended type annotation syntax -- there are already some RFCs for it, though of course it's anyone's guess whether they'll go through or what form they will end up taking -- we will start picking that up automatically too. But everything will remain enforced by whatever runtime you use and thus grounded in reality.
I can only speak for my own code base, but I consider it a good thing if inconsistencies are brought to the front during development. In normal PHP we do 3 things with docblocks:
1) Our own static analyser parses them and runs analysis for all calls and operations
2) We've used register_tick_function to perform quick and dirty tests on normal PHP, checking that the arguments to the current function match the docblock signature of the function
3) We've compiled our PHP with docblocks to Hack files, and used Hack's verifier
So it's certainly possible for programmers who aren't lazy-asses to use good tools during development to efficiently correct errors in their codebases ;-).
I would love to be able to run HHVM and get efficient run-time type checking working during development, while maintaining full PHP compat. It would catch cases our static analyser misses, and not involve messy code transformations or inefficient background processing.
I would love to be able to run HHVM and get efficient run-time type checking working during development, while maintaining full PHP compat.
Agree with this. It's something we definitely want to support a lot better than we do now -- just reading docblock comments is not IMO the way to do it.
I think the discussion here has run its course; I don't think there is anything directly actionable in this task beyond the general "figure out a way to fix this and then fix it", and so I'm going to close it. Feel free to let us know if you have any ideas on ways to better interoperate with PHP code that don't run into the pitfalls I outlined above.
For anyone still watching this thread, you may be interested in https://code.facebook.com/posts/398235553660954/announcing-the-hack-transpiler/ -- we more or less changed our minds on this :)
Woo! Now I can justify looking into Hack seriously. Thanks.
Re "at which point it just makes more sense to use HHVM", you've missed the point fully. I can write in PHP and it can run everywhere. I can write in Hack and it can't run anywhere.
Tldr: AD 3000 ain't the time. Neither is facebook the space. And thus only if it can be unbuggily converted to PHP would Hack make sense. It's that simple.
You've all done a great job with the Hack language, but as it's so new, I'm hesitant to start writing all of my code in Hack. But I'd love to start using Hack's type annotations in otherwise pure-PHP code. I bet I'm not the only one.
This would be a way for lots of PHP programmers to dip their toes into Hack while still having an easy exit if it doesn't work out, short- or long-term: we could gain most of the benefits of type annotations with HHVM's analysis in development and testing, even if the production environment is running stock PHP.
What's needed is a simple compiler that strips Hack's type annotations out, leaving valid PHP behind. Obviously, this would only work on the additions in Hack that are easily stripped out or compiled to PHP, but even if it's just the type annotations, that's still extremely beneficial and in-demand among PHP coders.
token_get_all() does work on Hack syntax, but tokens aren't that helpful; what we really need to do it right is the parse tree. If you implement the --parse command-line flag, it would be trivial to write such a tool. (And it probably wouldn't be that much more work to just build this entire feature.)
Thanks for your consideration.