php / php-src

The PHP Interpreter
https://www.php.net
Other
38.17k stars 7.75k forks source link

Avoid circular module dependencies #15580

Open cmb69 opened 2 months ago

cmb69 commented 2 months ago

Description

This issue has been triggered by https://github.com/php/php-src/pull/15548#discussion_r1728965655 and https://github.com/php/php-src/pull/15522#issuecomment-2305279764.

To clarify the terminology: "modules" are not referring to compilation units (with their headers), but rather the bigger picture (e.g, Zend, TSRM etc.); and "dependencies" refer to compile time (not runtime) dependencies.

While obviously not required (C and our build system is fine with circular module dependencies), it might make sense to restrict module dependencies so they can represented as directed acyclic graph; that may at least help conceptually. Supposedly, a simple (sorry, I'm pretty unexperienced with Memaid syntax) dependency graph could like like the following ("PHP" means rest of php-src):

flowchart TD
  Zend --> TSRM
  streams --> TSRM
  streams --> Zend
  PHP -->TSRM
  PHP --> Zend
  PHP --> streams

However, TSRM already depends on main/php_config.h (which is not that bad since it is only a configuration header), but already Zend (the Zend Engine, to be precise) depends on quite some stuff in main/, and due to zend_system_id even on ext/standard and ext/hash. And main/streams/ depends on even more ext/standard headers.

Now the question is, are we fine with the current state (i.e. to have circular dependencies for the sake of simplicity, and not having to change anything), or do we want to "improve" this? If we're strict about avoiding circular dependencies, we could (at least in theory) even build the modules as shared libraries.

petk commented 2 months ago

The TSRM was also planned to be integrated into Zend engine years ago. Can we also remove the TSRM directory?

nielsdos commented 2 months ago

@petk Do you have a link to that discussion somewhere?

petk commented 2 months ago

See this: https://github.com/php/php-src/pull/3459 and then there was another one with @nikic somewhere...

cmb69 commented 2 months ago

I found https://github.com/php/php-src/pull/5562#discussion_r424244626 (probably not what @petk was referring to).

Generally I think we should properly discuss on the internals mailing whether we want to integrate TSRM into the ZE, or keep them separate.

petk commented 2 months ago

Of course. Point is just to notify you of this. And the TSRM authors need to be written in the credits also in any way :)

And why is this important? Because this graph is actually not correct. TSRM already cannot be treated as a standalone module here. It can already be treated as just an object that is belonging to Zend. That's how I'll resolve this in CMake soon this with object library added to Zend.

cmb69 commented 2 months ago

It looks like that the Zend/ directory was not contained in the php-src repo during the PHP4 days, but was only added to the PHP4 release tarballs (the last commit on the PHP-4.4 branch confirms that); it likely had been developed in another repository owned by Zend. However, the TSRM/ directory was part of php-src, so the strict separation was almost necessary back then.

For PHP5, the Zend/ directory has been integrated into php-src, but apparently the Zend/TSRM separation has been kept (it looks like TSRM had it's own build support). Apparently, over time most of the build stuff has been removed from TSRM (perhaps because it was broken). And since this had already started years ago, but nobody complained, we can assume that nobody else but PHP uses the TSRM. As such it may make sense to treat TSRM as part of the Zend engine.

Still, this looks messed up. On platforms other than Windows, TSRM depends on main/php_config.h (like I've already said in the OP), but on Windows it uses Zend/zend_config.w32.h instead (apparently there is no Zend/zend_config.h). But worse, on that platform, TSRM depends on stuff in win32 (what might be okay), but also uses zend_virtual_cwd.h (okay if we treat TSRM and the Zend engine as a single module), but also apparently depends on SAPI.h.

So the question is still: are we fine with these circular module dependencies, or not?

petk commented 2 months ago

So the question is still: are we fine with these circular module dependencies, or not?

Well, I'm not fine with circular dependencies. They are architectural design issue and shouldn't be done at all.

apparently there is no Zend/zend_config.h

There is Zend/zend_config.h generated if it helps, which could be replaced here I think, so there's one issue less.

--- a/TSRM/TSRM.h
+++ b/TSRM/TSRM.h
@@ -17,7 +17,7 @@
 # define TSRM_WIN32
 # include <Zend/zend_config.w32.h>
 #else
-# include <main/php_config.h>
+# include <Zend/zend_config.h>
 #endif

 #include <stdint.h>

But also Zend engine heavily depends on PHP's ext/standard and some ext/hash.

cmb69 commented 2 months ago

Well, I'm not fine with circular dependencies. They are architectural design issue and shouldn't be done at all.

I agree.

There is Zend/zend_config.h generated if it helps, which could be replaced here I think, so there's one issue less.

Ah, nice!

But also Zend engine heavily depends on PHP's ext/standard and some ext/hash.

Right (I've already said that in the OP).

But worse, on that platform, TSRM depends […] on SAPI.h.

And that is what happens if you don't enforce non-circular module dependencies: https://github.com/php/php-src/commits/64934cf3607d03318efdb170dafcd9a92de48f1a

Girgias commented 2 months ago

I agree with the premise here in general.

Getting rid of ext/hash dependencies should be rather easy, as it always uses one utility function. ext/standard might be a tad harder as it is for MD5 which could be solved either by moving it into Zend or into a new "outside" folder.

For the virtual CWD I don't really understand why this is in Zend in the first place, as it seems to be mainly "isolated" from it in the first place.

cmb69 commented 2 months ago

Getting rid of ext/hash dependencies should be rather easy, as it always uses one utility function. ext/standard might be a tad harder as it is for MD5 which could be solved either by moving it into Zend or into a new "outside" folder.

Yeah, I think the low level md5 implementation (not the userland functions) should be moved into Zend/. Then we could also use make_digest_ex() instead of php_hash_bin2hex() to get rid of the ext/hash dependency.

For the virtual CWD I don't really understand why this is in Zend in the first place, as it seems to be mainly "isolated" from it in the first place.

Me neither, but if we regard Zend/ and TSRM/ as single module (see Peter's comment above), than it doesn't really matter.

Girgias commented 1 month ago

The Virtual CWD currently has dependencies on Win32, is VCWD only ever used as part of TSRM?

cmb69 commented 1 month ago

zend_virtual_cwd.c was originally trsm_virtual_cwd.c (e30b2aae5ad93405eee19bca6125ea872f409c82); why it had been moved – I don't know.

Girgias commented 1 month ago

Btw @cmb69 this PR is somewhat related: https://github.com/php/php-src/pull/15520