aboutcode-org / fetchcode

A library to reliably fetch code via HTTP, FTP and version control systems. This project is sponsored by the NLnet project https://nlnet.nl/project/vulnerabilitydatabase/ as well as Google Summer of Code, nexB and other generous sponsors!

Reuse/copy pip code for VCS and download #1

Open pombredanne opened 5 years ago

pombredanne commented 5 years ago

pip is a good starting point as https://github.com/pypa/pip/blob/master/src/pip/_internal/download.py is a solid and reliable download utility tested with billions of downloads.

There are a few ways to handle this:

  1. use https://github.com/sarugaku/pip-shims and reuse pip code
  2. copy and fork pip code
  3. vendor pip code

Note that pip also handles VCS URLs. See https://github.com/pypa/pip/tree/master/src/pip/_internal/vcs

The download location specified in SPDX is mostly derived from the pip URLs: https://github.com/spdx/spdx-spec/blob/db06dc81e525e08035af34117127742337e1f1b6/chapters/3-package-information.md#37-package-download-location-
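For illustration, a pip-style VCS URL such as `git+https://github.com/pypa/pip.git@21.0#egg=pip` combines a VCS name, a transport scheme and an optional revision after `@`. A minimal sketch of splitting such a URL with the standard library only follows. `VCSUrl` and `parse_vcs_url` are hypothetical names made up for this example; they are not part of pip or fetchcode.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit, urlunsplit


class VCSUrl(NamedTuple):
    """Hypothetical container for the parts of a pip-style VCS URL."""
    vcs: str                 # e.g. "git", "hg", "svn", "bzr"
    url: str                 # repository URL without the "vcs+" prefix or revision
    revision: Optional[str]  # branch, tag or commit given after "@", if any


def parse_vcs_url(location: str) -> VCSUrl:
    """Split a URL like "git+https://host/repo.git@rev#egg=name" into its parts."""
    parts = urlsplit(location)
    if "+" not in parts.scheme:
        raise ValueError(f"not a VCS URL: {location!r}")
    vcs, transport = parts.scheme.split("+", 1)

    # Only the path is searched for "@", so a user name in the host part
    # (as in "git+ssh://git@host/repo.git") is left untouched.
    path, revision = parts.path, None
    if "@" in path:
        path, revision = path.rsplit("@", 1)

    # Query string and fragment (such as "#egg=name") are dropped here.
    url = urlunsplit((transport, parts.netloc, path, "", ""))
    return VCSUrl(vcs, url, revision or None)
```

Plain HTTP or FTP URLs are rejected with `ValueError` in this sketch, so a caller could use that to choose between a VCS checkout and a plain download.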

pip does not handle FTP, AFAIK.
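As a point of comparison, Python's standard `urllib.request` opens `http://`, `https://`, `ftp://` and `file://` URLs through the same call. A minimal sketch of a plain download built on it follows; `fetch_to_file` is a hypothetical helper written for this example, not pip or fetchcode code, and it omits the retries, checksums and resume logic a reliable downloader would need.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_to_file(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Stream the content at `url` into the file `dest` and return its path."""
    target = Path(dest)
    with urllib.request.urlopen(url, timeout=timeout) as response, target.open("wb") as out:
        # Copy in chunks so large files are not loaded into memory at once.
        shutil.copyfileobj(response, out)
    return target
```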

TG1999 commented 5 years ago

Hi, can you tell me the steps I should follow to solve this issue?

TG1999 commented 5 years ago

From your description, I understand that I should create a download.py file inside fetchcode, modeled on https://github.com/pypa/pip/blob/master/src/pip/_internal/download.py. Is that correct?

pombredanne commented 5 years ago

Now that we have an API, the next step is this ticket.

TG1999 commented 4 years ago

https://github.com/pypa/pip/blob/master/src/pip/_internal/download.py is missing now; I am not able to open it.

pombredanne commented 4 years ago

It was at https://github.com/pypa/pip/blob/13ab7a2bce8fcde72b722cfc803d34671f1cd855/src/pip/_internal/download.py but has since been split up into several modules under https://github.com/pypa/pip/tree/master/src/pip/_internal/network

The parts that deal with VCS are more stable: https://github.com/pypa/pip/tree/master/src/pip/_internal/vcs

In any case, we will have to fork, as there is no stable code API in pip.

TG1999 commented 4 years ago

A small question: do we expect the user to have installed git, mercurial or whichever VCS is needed for the repo they want to download?

pombredanne commented 4 years ago

@TG1999 yes. It would be nice if we could use some library for Git, but reusing the command line should be simpler, and this is how pip works. Since this can be tricky code, it is best to reuse it.
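A minimal sketch of the command line approach, assuming the `git` executable is installed and on the `PATH`, could look like the following. `build_git_clone_command` and `fetch_git` are hypothetical helpers written for this example; pip's actual vcs module is considerably more thorough.

```python
import shutil
import subprocess
from typing import List, Optional


def build_git_clone_command(url: str, dest: str, revision: Optional[str] = None) -> List[str]:
    """Return the argument list for cloning `url` into `dest`."""
    command = ["git", "clone", "--quiet"]
    if revision:
        # --branch accepts a branch or tag name, not an arbitrary commit id.
        command += ["--branch", revision]
    # "--" marks the end of options so a URL starting with "-" is not
    # interpreted as a command line flag.
    command += ["--", url, dest]
    return command


def fetch_git(url: str, dest: str, revision: Optional[str] = None) -> str:
    """Clone a Git repository by calling the git command line tool."""
    if shutil.which("git") is None:
        raise RuntimeError("git executable not found on PATH")
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(build_git_clone_command(url, dest, revision), check=True)
    return dest
```

Passing the arguments as a list, without a shell, avoids quoting problems with unusual URLs or paths.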

pombredanne commented 4 years ago

@pombredanne wrote

Hey :) Let me check in detail which bits of pip we could reuse. IMHO the vcs module is really worth it: it is a well-tested command line wrapper for git/hg/svn etc. See https://github.com/pypa/pip/blob/e79fa0e8a249c3b3c6711e2ad85b0235a8a5d70a/src/pip/_internal/vcs/versioncontrol.py#L506 which is the entry point, IMHO.

@TG1999 replied

Okay, I follow. Please correct my steps:

  1. First I should fork the pip repo.
  2. Then copy the code inside the repo, as it is, into the root directory.

What should I do after that? The entry point versioncontrol.py needs some inner modules present inside pip, so do I have to copy the whole code, or only what is under src/pip?

TG1999 commented 4 years ago

Hi, could you please explain the steps to me? :)

pombredanne commented 4 years ago

@TG1999 the things to do could be one of these:

  1. reuse @techalchemy https://github.com/sarugaku/pip-shims
  2. create a wrapper (likely reusing pip-shims) that would also vendor pip so we have this really reusable as a library
  3. carefully extract the subset we care for (mostly the vcs module and its tests and deps) and just fork it

pombredanne commented 4 years ago

An alternative could be to look at https://github.com/juju/charm-helpers/tree/master/charmhelpers/fetch which has Git, bzr and HTTP support (and claims FTP support).

TG1999 commented 4 years ago

Agreed, @pombredanne. I will send a PR by this weekend.

TG1999 commented 4 years ago

@pombredanne I think we can close this now :)