Open pionxzh opened 1 year ago
It's usually easier to identify some dependencies by checking them yourself or reading something like LICENSE.txt.
I don't know if this will help, maybe the developer can identify some dependencies in advance?
/*! For license information please see main.64e92519.js.LICENSE.txt */
/*
object-assign
(c) Sindre Sorhus
@license MIT
*/
/*
object-assign
(c) Sindre Sorhus
@license MIT
*/
/* NProgress, (c) 2013, 2014 Rico Sta. Cruz - http://ricostacruz.com/nprogress
* @license MIT */
/*!
pica
https://github.com/nodeca/pica
*/
/*!
Copyright (c) 2018 Jed Watson.
Licensed under the MIT License (MIT), see
http://jedwatson.github.io/classnames
*/
/*!
* PEP v0.5.1 | https://github.com/jquery/PEP
* Copyright jQuery Foundation and other contributors | http://jquery.org/license
*/
/*!
* The buffer module from node.js, for the browser.
*
* @author Feross Aboukhadijeh <feross@feross.org> <http://feross.org>
* @license MIT
*/
/*!
* The buffer module from node.js, for the browser.
*
* @author Feross Aboukhadijeh <https://feross.org>
* @license MIT
*/
/*! *****************************************************************************
Copyright (c) Microsoft Corporation.
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
***************************************************************************** */
/*! Fabric.js Copyright 2008-2015, Printio (Juriy Zaytsev, Maxim Chernyak) */
/*! ieee754. BSD-3-Clause License. Feross Aboukhadijeh <https://feross.org/opensource> */
/*! js-cookie v3.0.5 | MIT */
/*! regenerator-runtime -- Copyright (c) 2014-present, Facebook, Inc. -- license (MIT): https://github.com/facebook/regenerator/blob/main/LICENSE */
/*! safe-buffer. MIT License. Feross Aboukhadijeh <https://feross.org/opensource> */
/**
* @license
* Copyright 2010-2022 Three.js Authors
* SPDX-License-Identifier: MIT
*/
/**
* @license
* Lodash <https://lodash.com/>
* Copyright OpenJS Foundation and other contributors <https://openjsf.org/>
* Released under MIT license <https://lodash.com/license>
* Based on Underscore.js 1.8.3 <http://underscorejs.org/LICENSE>
* Copyright Jeremy Ashkenas, DocumentCloud and Investigative Reporters & Editors
*/
/**
* @license React
* react-dom.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/**
* @license React
* react-jsx-runtime.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/**
* @license React
* react.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/**
* @license React
* scheduler.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/**
* @license React
* use-sync-external-store-shim.development.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/**
* @license React
* use-sync-external-store-shim.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/**
* @license React
* use-sync-external-store-shim/with-selector.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/** @license React v0.20.2
* scheduler.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/** @license React v16.13.1
* react-is.production.min.js
*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
/** @preserve
* Counter block mode compatible with Dr Brian Gladman fileenc.c
* derived from CryptoJS.mode.CTR
* Jan Hruby jhruby.web@gmail.com
*/
/** @preserve
(c) 2012 by Cédric Mesnil. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
This issue exists because we need to know which library we are processing in order to give appropriate output. A license comment can be a good hint for both humans and wakaru. Modern bundlers often destroy most information, including method names, so module/function detection is still required. And this list won't grow indiscriminately; we will pick high-value targets. 🙏
Devs can still identify a module themselves and rename it.
that can help us transform the code and give the extracted module a better name other than module-xxxx.js
This could then also tie in well with some of the ideas for 'unmangling identifiers' that I laid out here:
Theoretically if we can identify a common open source module, we could also have pre-processed that module to extract variable/function names, that we could then potentially apply back to the identified module.
I kind of think of this like 'debug symbols' used in compiled binaries.
Though technically, if you know the module, can get its original source, and have the webpacked version of that code, you could also generate a sourcemap that lets the user map between the two versions of the code.
When I was manually attempting to reverse and identify the modules in #40, a couple of techniques I found useful:
- Symbol()
- s.displayName and similar
Edit: This might not be useful right now, but just added a new section to one of my gists with some higher level notes/thoughts on fingerprinting modules; that I might expand either directly, or based on how this issue pans out:
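As a rough illustration of the Symbol() / displayName technique mentioned above: string arguments to Symbol() and values assigned to .displayName tend to survive minification, so they can be collected as identity hints. This is only a regex-based sketch; `collectIdentityHints` is a hypothetical helper, not something wakaru provides, and a real implementation would more likely walk the AST.

```javascript
// Hypothetical helper: collect string hints that often survive minification.
function collectIdentityHints(source) {
  const hints = { symbols: [], displayNames: [] };

  // Symbol("x") and Symbol.for("x") keep their description string.
  const symbolRe = /\bSymbol(?:\.for)?\(\s*(["'])([^"']+)\1\s*\)/g;
  // Assignments such as `s.displayName = "Foo"` (comparisons with == are not matched).
  const displayNameRe = /\.displayName\s*=\s*(["'])([^"']+)\1/g;

  for (const m of source.matchAll(symbolRe)) hints.symbols.push(m[2]);
  for (const m of source.matchAll(displayNameRe)) hints.displayNames.push(m[2]);
  return hints;
}

// Made-up minified snippet, for illustration only.
const sample = 'var a=Symbol.for("react.element");s.displayName="Tooltip";';
console.log(collectIdentityHints(sample));
```

The collected strings could then be compared against strings known to appear in specific libraries.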
While it might be more effort than it's worth, it may also be possible to extract the patterns that wappalyzer was using to identify various libraries; which I made some basic notes on in this revision to the above gist:
Within some webpacked code I was looking at (Ref):
We can easily identify a number of the React modules based on their license header; which also includes the original filename:
~/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/653.js:
13730 "use strict";
13731 /**
13732: * @license React
13733 * react-is.production.min.js
13734 *
~/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/framework.js:
5 2920: function (e, n, t) {
6 /**
7: * @license React
8 * react-dom.production.min.js
9 *
..
8452 82875: function (e, n, t) {
8453 /**
8454: * @license React
8455 * react-jsx-runtime.production.min.js
8456 *
....
8492 99504: function (e, n) {
8493 /**
8494: * @license React
8495 * react.production.min.js
8496 *
....
8891 95507: function (e, n) {
8892 /**
8893: * @license React
8894 * scheduler.production.min.js
8895 *
~/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/pages/_app.js:
47741 93802: function (U, B) {
47742 "use strict";
47743: /** @license React v16.13.1
47744 * react-is.production.min.js
47745 *
.....
54586 "use strict";
54587 /**
54588: * @license React
54589 * use-sync-external-store-shim.production.min.js
54590 *
.....
54654 "use strict";
54655 /**
54656: * @license React
54657 * use-sync-external-store-shim/with-selector.production.min.js
54658 *
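The banner matching shown in the search output above could be automated along these lines. This is a minimal sketch that assumes the two banner shapes seen in this thread (with and without a version after "@license React"); `detectReactBanner` is a hypothetical name, and the version it returns is whatever the banner states, which for packages like scheduler is not the React version.

```javascript
// Hypothetical helper: derive a module hint from a React-style license banner.
function detectReactBanner(source) {
  // Handles both shapes seen above:
  //   /** @license React v16.13.1
  //    * react-is.production.min.js
  // and the newer form with "@license React" on its own line.
  const bannerRe = /@license React(?: v(\d+\.\d+\.\d+))?\s*\n\s*\*\s*(\S+\.js)/;
  const m = bannerRe.exec(source);
  if (!m) return null;
  return { version: m[1] ?? null, file: m[2] };
}

// Example banner copied from the LICENSE.txt listing earlier in the thread.
const banner = "/** @license React v16.13.1\n * react-is.production.min.js\n *\n */";
console.log(detectReactBanner(banner));
```

The `file` value still includes the build suffix (for example `.production.min.js`), so a caller would strip that before using it as a module name.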
And at least in this bundled code, statsig-js seems to make its presence known (though this is the only thing in that module):
export default JSON.parse(
'{"name":"statsig-js","version":"4.32.0","description":"Statsig JavaScript client SDK for single user environments.","main":"dist/index.js","types":"dist/index.d.ts","scripts":{"prepare":"rm -rf build/ && rm -rf dist/ && tsc && webpack","postbuild":"rm -rf build/**/*.map","test":"jest --config=jest-debug.config.js","testForGithubOrRedisEnthusiasts":"jest","test:watch":"jest --watch","build:dryrun":"npx tsc --noEmit","types":"npx tsc"},"files":["build/statsig-prod-web-sdk.js","dist/*.js","dist/*.d.ts","dist/utils/*.js","dist/utils/*.d.ts"],"jsdelivr":"build/statsig-prod-web-sdk.js","repository":{"type":"git","url":"git+https://github.com/statsig-io/js-client-sdk.git"},"author":"Statsig, Inc.","license":"ISC","bugs":{"url":"https://github.com/statsig-io/js-client-sdk/issues"},"keywords":["feature gate","feature flag","continuous deployment","ci","ab test"],"homepage":"https://www.statsig.com","devDependencies":{"@babel/preset-env":"^7.14.9","@babel/preset-typescript":"^7.14.5","@types/jest":"^27.1.0","@types/uuid":"^8.3.1","circular-dependency-plugin":"^5.2.2","core-js":"^3.16.4","jest":"^27.1.0","terser-webpack-plugin":"^5.1.4","ts-jest":"^27.1.0","ts-loader":"^9.2.3","typescript":"^4.2.2","webpack":"^5.75.0","webpack-cli":"^4.10.0"},"dependencies":{"js-sha256":"^0.9.0","uuid":"^8.3.2"},"importSort":{".js, .jsx, .ts, .tsx":{"style":"module","parser":"typescript"}}}'
);
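A module like the one above could be picked up automatically by looking for `JSON.parse('...')` payloads that parse to something resembling a package.json. The following is a hedged sketch: `findEmbeddedPackageJson` is a hypothetical helper, and it assumes a single-quoted payload with no escaped quotes or JS escape sequences, which holds for the statsig-js example but not in general.

```javascript
// Hypothetical helper: find JSON.parse('...') payloads that look like a package.json.
function findEmbeddedPackageJson(source) {
  const found = [];
  // Assumption: single-quoted payload without escaped single quotes.
  const re = /JSON\.parse\(\s*'([^']*)'\s*\)/g;
  for (const m of source.matchAll(re)) {
    try {
      const data = JSON.parse(m[1]);
      if (data && typeof data.name === "string" && typeof data.version === "string") {
        found.push({ name: data.name, version: data.version });
      }
    } catch {
      // The extracted text is not valid JSON; skip this candidate.
    }
  }
  return found;
}

// Shortened version of the module shown above.
const moduleSource =
  "export default JSON.parse(\n'{\"name\":\"statsig-js\",\"version\":\"4.32.0\",\"license\":\"ISC\"}'\n);";
console.log(findEmbeddedPackageJson(moduleSource));
```

A name and version recovered this way identify the package directly, and could also help narrow down neighbouring modules that belong to the same library.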
See also:
With regards to module detection/similar for React, these might be interesting/useful:
A tool to update app sourcemaps with the original code of ReactDOM's production builds
This package includes:
- the actual sourcemaps
- logic to search an input sourcemap for specific ReactDOM prod artifacts by content hash and replace them with the "original" pre-minified bundle source via the sourcemaps
- a CLI tool that will load a given input sourcemap file and rewrite it
- a build tool plugin that will automatically replace react-dom sourcemaps
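The "search by content hash" step in that list could look roughly like the sketch below. It is not that package's actual code: `findKnownArtifacts` and the `knownHashes` table are hypothetical, and a small FNV-1a-style hash over UTF-16 code units stands in for whatever hash the real tool uses, purely to keep the example dependency-free.

```javascript
// Stand-in content hash (assumption): 32-bit FNV-1a-style over UTF-16 code units.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical lookup: report which sourcesContent entries match a known artifact.
function findKnownArtifacts(sourcemap, knownHashes) {
  const matches = [];
  (sourcemap.sourcesContent || []).forEach((content, index) => {
    if (typeof content !== "string") return; // entries may be null
    const artifact = knownHashes.get(fnv1a(content));
    if (artifact) matches.push({ index, source: sourcemap.sources[index], artifact });
  });
  return matches;
}

// Made-up data: one "known" minified artifact and one application file.
const minified = "function a(){return 1}";
const knownHashes = new Map([[fnv1a(minified), "example-lib.production.min.js"]]);
const map = {
  version: 3,
  sources: ["node_modules/example-lib/index.js", "src/app.js"],
  sourcesContent: [minified, "console.log('app')"],
};
console.log(findKnownArtifacts(map, knownHashes));
```

The same idea applies to unpacked modules: an exact hash match only works when the bundler leaves the artifact byte-for-byte intact, which is why fuzzier fingerprints are also being discussed in this issue.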
I don't know how W3Techs counts but the HTTP Archive Almanac 2022 uses Wappalyzer v6.10.26, whose React version detection logic seems to look for (a) a global React.version property or (b) a version number in a script filename that clearly indicates React. Both of these are very uncommon ways to deploy React these days. Even for detecting React as a whole, it uses data attributes no longer used by React or a _reactRootContainer property that is not added when using modern React APIs such as React 18 createRoot (and only looks for that property on divs that are direct children of <body>).
I won't copy the content here in full as it was pretty long, but I detailed some of my higher level thoughts around some more 'esoteric' methods that might be applicable to module detection (AST fingerprinting, code similarity, etc) in this comment:
This specific implementation is more related to detecting and injecting into webpack modules at runtime, but it might have some useful ideas/concepts that are applicable at the AST level too:
// ..snip..
export const common = { // Common modules
  React: findByProps('createElement'),
  ReactDOM: findByProps('render', 'hydrate'),
  Flux: findByProps('Store', 'connectStores'),
  FluxDispatcher: findByProps('register', 'wait'),
  i18n: findByProps('Messages', '_requestedLocale'),
  channels: findByProps('getChannelId', 'getVoiceChannelId'),
  constants: findByProps('API_HOST')
};
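The snippet above does not include `findByProps` itself, so here is one guess at how such a helper could work: scan every module's exports and return the first one that has all of the requested property names. `makeFindByProps` and the `modules` map are assumptions for illustration, not that project's implementation.

```javascript
// Hypothetical stand-in for the snipped helper: returns the first module export
// that exposes every requested property name.
function makeFindByProps(modules) {
  return function findByProps(...props) {
    for (const mod of Object.values(modules)) {
      const isObjectLike = mod !== null && (typeof mod === "object" || typeof mod === "function");
      if (!isObjectLike) continue;
      if (props.every((p) => p in mod)) return mod;
    }
    return undefined;
  };
}

// Made-up module table keyed by webpack-style numeric ids.
const modules = {
  101: { createElement() {}, Component: class {} },
  202: { render() {}, hydrate() {} },
  303: "not an object",
};
const findByProps = makeFindByProps(modules);
console.log(findByProps("render", "hydrate") === modules[202]); // true
```

At the AST level the equivalent would be checking which property names a module assigns on its exports object, since the exports themselves are not available without executing the code.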
There has recently been a new source of discussion around code fingerprinting and module identification over on the humanify repo in this issue:
Originally posted by @0xdevalias in https://github.com/pionxzh/wakaru/issues/74#issuecomment-2372650986
Similar to what we already have for babel runtime detection, consider introducing module code detection that can help us transform the code and give the extracted module a better name than module-xxxx.js.