theislab / sfaira

data and model repository for single-cell data
https://sfaira.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Bandit identifies many security issues #73

Open Zethson opened 3 years ago

Zethson commented 3 years ago

We can ignore the ones that we don't deem important, but here's the list for now:

Test results:
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
   Severity: Medium   Confidence: High
   Location: sfaira/data/dataloaders/loaders/d10_1038_s41586_020_2157_4/base.py:53
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
52          # download required files from loaders cell landscape publication data: https://figshare.com/articles/HCL_DGE_Data/7235471
53          print(urllib.request.urlretrieve(
54              'https://ndownloader.figshare.com/files/17727365',
55              os.path.join(self.path, "human", self._directory_formatted_doi, 'HCL_Fig1_adata.h5ad')
56          ))

--------------------------------------------------
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
   Severity: Medium   Confidence: High
   Location: sfaira/data/dataloaders/loaders/d10_1038_s41586_020_2157_4/base.py:57
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
56          ))
57          print(urllib.request.urlretrieve(
58              'https://ndownloader.figshare.com/files/21758835',
59              os.path.join(self.path, "human", self._directory_formatted_doi, 'HCL_Fig1_cell_Info.xlsx')
60          ))

--------------------------------------------------
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
   Severity: Medium   Confidence: High
   Location: sfaira/data/dataloaders/loaders/d10_1038_s41586_020_2157_4/base.py:61
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
60          ))
61          print(urllib.request.urlretrieve(
62              'https://ndownloader.figshare.com/files/22447898',
63              os.path.join(self.path, "human", self._directory_formatted_doi, 'annotation_rmbatch_data_revised417.zip')
64          ))

--------------------------------------------------
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
   Severity: Medium   Confidence: High
   Location: sfaira/estimators/keras.py:90
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
89              try:
90                  urllib.request.urlretrieve(self.model_dir,
91                                             os.path.join(self.cache_path, os.path.basename(self.model_dir))
92                                             )

--------------------------------------------------
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
   Severity: Medium   Confidence: High
   Location: sfaira/estimators/keras.py:96
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
95                  try:
96                      urllib.request.urlretrieve(urljoin(self.model_dir, f'{self.model_id}_weights.h5'),
97                                                 os.path.join(self.cache_path, f'{self.model_id}_weights.h5')
98                                                 )

--------------------------------------------------
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
   Severity: Medium   Confidence: High
   Location: sfaira/estimators/keras.py:102
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
101                     try:
102                         urllib.request.urlretrieve(urljoin(self.model_dir, f'{self.model_id}_weights.data-00000-of-00001'),
103                                                    os.path.join(self.cache_path, f'{self.model_id}_weights.data-00000-of-00001')
104                                                    )

--------------------------------------------------
>> Issue: [B303:blacklist] Use of insecure MD2, MD4, MD5, or SHA1 hash function.
   Severity: Medium   Confidence: High
   Location: sfaira/estimators/keras.py:159
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b303-md5
158         with open(fn, 'rb') as f:
159             hsh = hashlib.md5(f.read()).hexdigest()
160         if not hsh == target_md5:

--------------------------------------------------
>> Issue: [B303:blacklist] Use of insecure MD2, MD4, MD5, or SHA1 hash function.
   Severity: Medium   Confidence: High
   Location: sfaira/interface/user_interface.py:139
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b303-md5
138                     with open(os.path.join(subdir, file), 'rb') as f:
139                         md5.append(hashlib.md5(f.read()).hexdigest())
140         s = [i.split('_')[0:7] for i in file_names]

--------------------------------------------------
>> Issue: [B311:blacklist] Standard pseudo-random generators are not suitable for security/cryptographic purposes.
   Severity: Low   Confidence: High
   Location: sfaira/models/made.py:111
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b311-random
110             # Disallow unconnected units by sampling min from previous layer
111             input_sel = [randint(np.min(prev_sel), shape[-1] - 2) for i in range(shape[-1])]
112 

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:158
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
157                     with open(fn_history, 'rb') as f:
158                         histories[x] = pickle.load(f)
159                 else:

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:165
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
164                     with open(fn_eval, 'rb') as f:
165                         evals[x] = pickle.load(f)
166                 else:

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:172
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
171                     with open(fn_hp, 'rb') as f:
172                         hyperpars[x] = pickle.load(f)
173                 else:

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:179
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
178                     with open(fn_mhp, 'rb') as f:
179                         model_hyperpars[x] = pickle.load(f)
180                 else:

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:598
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
597             with open(f"{file_path_base}_model_hyperparam.pickle", 'rb') as file:
598                 hyparam_model = pickle.load(file)
599 

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:602
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
601             with open(f"{file_path_base}_hyperparam.pickle", 'rb') as file:
602                 hyparam_optim = pickle.load(file)
603 

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:640
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
639         with open(fn, 'rb') as f:
640             ids = pickle.load(f)
641         return ids

--------------------------------------------------
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
   Severity: Medium   Confidence: High
   Location: sfaira/train/summaries.py:1378
   More Info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b301-pickle
1377                with open(os.path.join(resultspath, f'{model_id}_grads.pickle'), 'rb') as f:
1378                    gradients_raw = pickle.load(f)
1379            else:

--------------------------------------------------

Code scanned:
    Total lines of code: 26483
    Total lines skipped (#nosec): 0

Run metrics:
    Total issues (by severity):
        Undefined: 0.0
        Low: 1.0
        Medium: 16.0
        High: 0.0
    Total issues (by confidence):
        Undefined: 0.0
        Low: 0.0
        Medium: 0.0
        High: 17.0
Files skipped (0):

Zethson commented 3 years ago

Bandit now runs with a rather large skip list:

skips: ['B101', 'B403', 'B404', 'B603', 'B607', 'B301', 'B303', 'B311', 'B310']
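
If we want to shrink that skip list later, two of the findings above (B310 and B303) could probably be handled in code rather than skipped globally. The following is only a rough sketch, not how sfaira does it: `ALLOWED_SCHEMES`, `check_url_scheme`, `safe_urlretrieve` and `file_md5` are hypothetical names, the HTTPS-only policy is an assumption, and `usedforsecurity=False` needs Python 3.9+. Whether a given Bandit version accepts the `# nosec B310` form and the `usedforsecurity` flag should be checked against the version in use.

```python
import hashlib
import urllib.request
from urllib.parse import urlparse

# Assumption: every download in the loaders and estimators is served over HTTPS.
ALLOWED_SCHEMES = {"https"}


def check_url_scheme(url: str) -> str:
    """Return the URL unchanged if its scheme is allowed, else raise ValueError."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Refusing to open URL with scheme {scheme!r}: {url}")
    return url


def safe_urlretrieve(url: str, filename: str):
    """Hypothetical drop-in for urllib.request.urlretrieve with a scheme check."""
    check_url_scheme(url)
    # The scheme was validated above, so this single call is marked as audited.
    return urllib.request.urlretrieve(url, filename)  # nosec B310


def file_md5(fn: str) -> str:
    """MD5 of a file, declared as an integrity checksum rather than a security hash."""
    with open(fn, "rb") as f:
        return hashlib.md5(f.read(), usedforsecurity=False).hexdigest()
```

With helpers like these, the calls in `keras.py` and the `d10_1038_s41586_020_2157_4` loader would go through `safe_urlretrieve`, and the two `hashlib.md5(...)` call sites would go through `file_md5`. B311 (`randint` in `made.py`) is not security-relevant, so a per-line `# nosec` there might be enough. The B301 pickle findings are harder and may have to stay skipped unless those files move to a safer format.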