cerebis / sim3C

Read-pair simulation of 3C-based sequencing methodologies (HiC, Meta3C, DNase-HiC)
GNU General Public License v3.0

wrong type passed #6

Closed zeeev closed 6 years ago

zeeev commented 6 years ago

Hi Matthew,

Thanks for writing sim3C, it's super useful. I think I found an easy bug to fix. I had two sequences that didn't have cut sites.

CutSites is passed seq.seq (the bare sequence) rather than the record itself, so raise NoCutSitesException(template_seq.id, str(enzyme)) fails because the bare sequence has no attribute id. Changing the type passed fixes the problem.
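
The failure mode can be reproduced without Biopython. The sketch below is only an illustration under stated assumptions, not sim3C's code: FakeSeq, FakeRecord and find_sites are hypothetical stand-ins for a bare sequence, a record that carries an id, and the site search inside CutSites, and the exception's constructor signature is assumed. It is written for Python 3.

```python
class NoCutSitesException(Exception):
    """Stand-in for sim3C's exception; the constructor signature is assumed."""
    def __init__(self, seq_id, enzyme_name):
        super().__init__('no cut sites for {} in {}'.format(enzyme_name, seq_id))


class FakeSeq(str):
    """Hypothetical bare sequence: holds the letters, has no .id."""


class FakeRecord:
    """Hypothetical record: wraps a sequence and carries an identifier."""
    def __init__(self, seq_id, letters):
        self.id = seq_id
        self.seq = FakeSeq(letters)


def find_sites(template_seq, motif):
    """Mimic the pre-fix flow: search template_seq, report errors via template_seq.id."""
    sites = [i for i in range(len(template_seq)) if template_seq.startswith(motif, i)]
    if not sites:
        # Intended error; evaluating template_seq.id fails first on a bare sequence.
        raise NoCutSitesException(template_seq.id, motif)
    return sites


record = FakeRecord('contig_1', 'AAAAAAAA')  # contains no GATC motif
try:
    find_sites(record.seq, 'GATC')  # bare sequence passed, as in the report
    outcome = 'no error'
except NoCutSitesException:
    outcome = 'NoCutSitesException'
except AttributeError:
    outcome = 'AttributeError'
print(outcome)  # prints AttributeError
```

The intended NoCutSitesException never surfaces, because building it requires the id that the bare sequence lacks.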

Thanks,

Zev

cerebis commented 6 years ago

Hello @zeeev, sorry for the slow reply and many thanks for the bug report. I will look into this shortly.

cerebis commented 6 years ago

From your description, I am not sure how you are using Sim3C.

Can I ask if you are using the code as a module in your own software? The expectation within class CutSites is that the parameter template_seq is a Bio.SeqRecord object, and therefore has the .id attribute.

I do not often use asserts in the code to enforce type checking, so these sorts of errors will occur if classes are passed incorrect types.
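
For illustration, an explicit check at the point of entry could look like the sketch below. This is an assumption about how such a guard might be written, not something present in sim3C: require_record is a hypothetical helper, and it uses duck typing (presence of .id and .seq) so that it runs without Biopython.

```python
from types import SimpleNamespace


def require_record(template_seq):
    """Fail fast unless the argument looks like a record (has .id and .seq).

    Hypothetical helper, not part of sim3C.
    """
    missing = [name for name in ('id', 'seq') if not hasattr(template_seq, name)]
    if missing:
        raise TypeError(
            'template_seq must be record-like; missing attribute(s): {}'.format(
                ', '.join(missing)))
    return template_seq


# A record-like object passes through unchanged.
record = SimpleNamespace(id='contig_1', seq='AAGATC')
checked = require_record(record)

# A bare string is rejected with a message naming what is missing.
try:
    require_record('AAGATC')
    message = None
except TypeError as err:
    message = str(err)
```

With Biopython available, an isinstance test against Bio.SeqRecord.SeqRecord would be the stricter alternative; the duck-typed form accepts any object that exposes the two attributes the class actually uses.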

zeeev commented 6 years ago

Hi Matthew,

I started by simply running the program to simulate Hi-C data, hit a bug, and then started digging. The git diff below shows the changes I made to fix the error. CutSites wasn't being passed the template_seq record itself, but rather its .seq attribute. Maybe it's a difference between my version of Biopython and yours?

Anyway, after the modification everything worked great. Again, thanks for this very useful piece of code.

Incorrect type passed:

         self.max_length = len(template_seq) - 1

         # find sites, converting from 1-based.
-        self.sites = np.array(enzyme.search(template_seq, linear)) - 1
+        self.sites = np.array(enzyme.search(template_seq.seq, linear)) - 1
         self.size = self.sites.shape[0]
         if self.size == 0:
+            print '{0}\n'.format(template_seq)
             raise NoCutSitesException(template_seq.id, str(enzyme))

         # method setup
@@ -347,7 +348,7 @@ class Replicon:
         if not enzyme:
             self.sites = AllSites(len(seq.seq), self.random_state)
         else:
-            self.sites = CutSites(enzyme, seq.seq, self.random_state, linear=linear)
+            self.sites = CutSites(enzyme, seq, self.random_state, linear=linear)

         self.length = len(self.seq)
         self.num_sites = self.sites.size
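
The pattern in the diff (pass the record, search its .seq, report by its .id) can be sketched on its own as follows. Everything here is a simplified stand-in written for Python 3: Record, search_one_based and cut_sites are hypothetical names, and search_one_based returns 1-based motif start positions rather than the true cut positions an enzyme search would give.

```python
class NoCutSitesException(Exception):
    """Stand-in for sim3C's exception; the constructor signature is assumed."""
    def __init__(self, seq_id, enzyme_name):
        super().__init__('no cut sites for {} in {}'.format(enzyme_name, seq_id))


class Record:
    """Hypothetical record: an identifier plus the sequence letters."""
    def __init__(self, seq_id, letters):
        self.id = seq_id
        self.seq = letters


def search_one_based(letters, motif):
    """Hypothetical stand-in for the enzyme search: 1-based motif starts."""
    return [i + 1 for i in range(len(letters)) if letters.startswith(motif, i)]


def cut_sites(template_seq, motif):
    """Search the record's sequence; use the record's id when nothing is found."""
    # find sites, converting from 1-based
    sites = [pos - 1 for pos in search_one_based(template_seq.seq, motif)]
    if not sites:
        raise NoCutSitesException(template_seq.id, motif)
    return sites


sites = cut_sites(Record('contig_1', 'AAGATCAAGATC'), 'GATC')

try:
    cut_sites(Record('contig_2', 'AAAAAAAA'), 'GATC')
    error_text = None
except NoCutSitesException as err:
    error_text = str(err)
```

Because the record is what gets passed in, a sequence with no sites now produces the intended exception, naming the offending sequence.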

cerebis commented 6 years ago

Thanks for the report.