Open PabloMoret opened 1 year ago
I only have a few minutes now, so I'll start by responding to what I can. There's a lot to respond to, so it's probably best to just start where I can!
According to the project instructions, CropRunner is somehow outdated (actually the last commit was one year ago)
The recent commits there are upgrading it from Python2 to Python3, though I don't believe that it was well tested even while making that update. There may just be some small syntax updates to make, which we would of course welcome your help with!
you are currently working on a new version. I'd like to know how's this update going
There are two things to note here. The first is that we have some work ongoing here, which is currently in flux and the repo is going to be moved at some point soon!
The other is that one big reason for writing a new cropper is that we've had some issues with the underlying data that we were generating from Project Sidewalk, and we were trying to fix them post-hoc with the cropper. Issue #8 discusses some of that. I did just recently figure out some fixes on the Project Sidewalk end, you can see my comment here. I haven't fixed the data yet... I'm expecting to be able to do that in the next 2-3 weeks. At which point, the cropper that we have here may be the one worth using! So maybe just stick with this one for now if you can get it to work?
Sorry that I didn't have time to get through much today, but I'll respond to more next week!
Hi @misaugstad
Thank you for your reply.
I only have a few minutes now, so I'll start by responding to what I can. There's a lot to respond to, so it's probably best to just start where I can!
Totally understandable. I guess you guys are so overloaded and I also know my issue covers so many topics, so we can go step by step.
The recent commits there are upgrading it from Python2 to Python3, though I don't believe that it was well tested even while making that update. There may just be some small syntax updates to make, which we would of course welcome your help with!
Sure, that's what I read from issue #8, but I believe if @robertyoung2 made it to run I can do it too. Of course I can help with the code, however first of all I may need to understand how ProjectSideWalk Web Page works so it's consistent. And there's also another concern: I'm just a bachelor student, so I'll try my best in coding (you probably be more code skilled).
There are two things to note here. The first is that we have some work ongoing here, which is currently in flux and the repo is going to be moved at some point soon!
Didn't know about this repo. Got to read it and study it deeply first.
The other is that one big reason for writing a new cropper is that we've had some issues with the underlying data that we were generating from Project Sidewalk, and we were trying to fix them post-hoc with the cropper. Issue https://github.com/ProjectSidewalk/sidewalk-panorama-tools/issues/8 discusses some of that. I did just recently figure out some fixes on the Project Sidewalk end, you can see my comment https://github.com/ProjectSidewalk/SidewalkWebpage/issues/2485#issuecomment-1482054004. I haven't fixed the data yet... I'm expecting to be able to do that in the next 2-3 weeks. At which point, the cropper that we have here may be the one worth using! So maybe just stick with this one for now if you can get it to work?
Ok, actually some of the issues I posted here regard the issue #8 and the cropper problem. Meanwhile I'll be downloading the whole pano dataset and trying the old cropper @robertyoung2 updated to see whether works fine or not for me.
Sorry that I didn't have time to get through much today, but I'll respond to more next week!
No problem, take your time, I know this is huge.
As far as I understand, both metadata and depth data were/are used to: Set the lat/lng coordinates (https://github.com/ProjectSidewalk/SidewalkWebpage/issues/2374#issue-744743551, https://github.com/ProjectSidewalk/SidewalkWebpage/issues/2362#issue-741905788) to labels using decoded depth information. And also: To "equally" crop panos of all labels since labels from different distances yield different perspectives and may influence in Computer Vision applications. Otherwise results might be biased. Is that true?
I believe that in the past, we were using depth data for both of these. And now we are using only the metadata we have available for both. For the cropper, it looks like the code to figure out the size of the crops can be found here.
In the issue https://github.com/ProjectSidewalk/SidewalkWebpage/issues/2374#issuecomment-735899524 you talk about the ground truth. By googling it, says it's data or information to be real or true. However, I just wanted to make sure has nothing to do with GSV ground lat/lng position. I guess when you talk about ground truth you just mean lat/lng and GSV info gathered directly via google API. No info inferred from it.
I am not 100% sure what you're asking, but I will try to summarize what we meant. Let me know if you need any other clarification!
We wanted to get an accurate lat/lng estimation for labels placed through Project Sidewalk without using depth data. The most accurate data that we have on lat/lngs for labels are the numbers that we derived using GSV's depth data in the past. So we treated the locations we found using depth data in the past as "ground truth", and trained regressions aiming for that.
canvas_x
andcanvas_y
represent the coordinates (x & y values) of the label within the canvas (720x480 exploring box of a pano: cropped pic where the tag's placed) in ProjectSidewalk web page. ...sv_image_x
&sv_image_y
are the coordinates (x & y values) of the label within the pano.
Correct!
Therefore,
canvas_x
&canvas_y
andsv_image_x
&sv_image_y
are essentially the same (equivalent) but in different systems: x & y position values in canvas and pano respectively.
Basically correct! For some reason, sv_image_y
is actually defined as the number of pixels from the center of the image... So to get the true sv_image_y
, you take sv_image_width/2 - sv_image_y
... But the cropping code takes that into account right now. And I will be fixing this in the next couple weeks so that sv_image_y
is actually the y-coordinate.
Note: I assumed (based on CropRunner and how labels are placed using ProjectSidewalk web labeling tool) any label is not strictly in the middle of the crop. Please let me know if otherwise.
I believe that labels are actually in the middle of crops create by CropRunner. Unfortunately I haven't actually looked at data generated from it in awhile, but I believe that when we take the crops, we are trying to center the label.
We are using the canvas_x/y
values to recreate the scene that a user was looking at when they labeled in other tools like Sidewalk Gallery, but I believe that we are aiming for the label being centered for CV.
As far as I undestand (again refering to the same https://github.com/ProjectSidewalk/sidewalk-panorama-tools/issues/8#issuecomment-772052734 as above), the distance is the segment between the GSV car that took the pano (actually the lat/lgn position of the pano) and the label itself. Is it?
Correct! Is this data that you think you need?
Heading: Is it the x position of the label? Is it the x position of the center of the canvas (720x480) instead?
The latter. It's the heading (in degrees) of the center of the canvas with respect to true north.
What's the difference between the heading in samples/labeldata.csv, https://github.com/ProjectSidewalk/sidewalk-panorama-tools/issues/8#issuecomment-781741256 or any other csv file containing metadata AND the heading computed with the regression method (difference in heading).
Since the heading
we have is the center of the canvas, that difference in heading when doing the regression is trying to figure out where the heading of the label is. So that different would be 0 if the label was on the center of the canvas.
Photographer heading: Honestly I have no idea what's this variable
The photographer_heading
is the heading (with respect to true north) of the camera when the image was taken. So it is the heading of the center of the entire panorama image.
what's the reason behind substracting photographer_heading to and then using it to calculate label x position? Is it because photographer_heading is displaced? Why not just using photographer_heading or heading directly?
Oh! I think that this might be an attempt fix an issue with sv_image_x
that comes from Project Sidewalk. Right now, sv_image_x
numbers are wrong because they aren't taking into account the photographer_heading
. So this must be the fix on the cropping side. I recently submitted this PR which fixes sv_image_x
... So my assumption is that you should no longer need to calculate x
using the photographer_heading
and im_width
anymore.
Note that I have this fixed for new data coming in, but I still need to retroactively fix the old data. Should do it in the next two weeks, but it's in flux now!
And as I mentioned above, the calculation for y = im_height / 2 - sv_image_y
) won't be necessary anymore once I make these changes.
Okay that's what I've got for now! Hoping to finish up responding later in the week. And hopefully some of what I posted here clears up other things I haven't gotten to yet as well :)
First of all I'd like to thank your reply. Didn't expect an answer so quick.
Sure, I know now you're using the regression model to compute distance and "equally" crop labels. This issue cleared up.
Totally what I meant. No further clarification! :)
Here I want to begin with Issue True North first, since it's a really important concept to have cleared up, so there's no misunderstanding in how other values are computed.
I'd like to make this clear in the first place.
While I was waiting for your reply, I took a look at some comments, specifically to #2485 and certain comments in #1708.
I'm aware it's quite an old post but based on @kavidey's comment I've discover there's a true north direction in panoramas. Also mentioned by you:
... It's the heading (in degrees) of the center of the canvas with respect to true north.
Also following the comment's statement:
sv_x
is distance from true north to label in px.sv_y
is the distance from horizon to label in px.photographerHeading
is the angle distance from true north to where the photographer was heading.heading
is the angle between true north and point (probably refeered to label).However based on info above some other questions arise (the first 3 questions have concepts in common, so you may read them all together so makes sense):
Following the principle It ranges from -(imgWidth/2) to (imgWidth/2) where 0 is true north
here's what I've come up with: Instead of representing coordinate $(0,0)$ or $0º$ in the left most side of the $x$ axis (the left bottom corner of the pano), the origin is in the middle of the pano ($x$ axis), therefore the true north is in the middle of the pano as well, and finally it can be assumed the left most side is coordinate -imageWidth / 2
or $-180º$ and right most side represent +imageWidth / 2
or $+180º$.
The problem with true north is that, assuming it lies at the center of a pano (so every GSV pano is fixed), pano_yaw_deg - 180
makes sense because would yield $0º$, cause it's 0px (left bottom corner), but metadata has different values for photographer_heading
(differ from $180º$), like 89.32º
so 89.32 - 180
does not yield 0º
, which would be 0px from the pano (left bottom corner).
Therefore the previous insight about origin and coordinates would be wrong.
Is sv_x
the same as sv_image_x
and the same for the y
component?
Maybe it's because it's an old post, but I guess sv_x
and sv_y
refeer to sv_image_x
and sv_image_y
respectively.
What is the relation between pano px coordinates and values in sexagesimal system?
More questions regarding positions with respect to true north in the following issues.
Actually I meant exploring box, not label crop, since the latter lies exactly in the middle.
... So to get the true sv_image_y, you take sv_image_width/2 - sv_image_y... But the cropping code takes that into account right now. And I will be fixing this in the next couple weeks so that sv_image_y is actually the y-coordinate.
Got that. So true horizon being at the middle of the pano could possibly be the reason of substracting half the pano's heigth (I guess you meant height), right?
y = im_height / 2 - sv_image_y
I believe that labels are actually in the middle of crops create by CropRunner. Unfortunately I haven't actually looked at data generated from it in awhile, but I believe that when we take the crops, we are trying to center the label.
We are using the canvas_x/y values to recreate the scene that a user was looking at when they labeled in other tools like Sidewalk Gallery, but I believe that we are aiming for the label being centered for CV.
Totally right. My fault. I didn't express myself well.
Here:
canvas_x and canvas_y represent the coordinates (x & y values) of the label within the canvas (720x480 exploring box of a pano: cropped pic where the tag's placed)
and
Note: I assumed (based on CropRunner and how labels are placed using ProjectSidewalk web labeling tool) any label is not strictly in the middle of the crop. Please let me know if otherwise.
in these two paragraphs, I actually meant the exploring box (canvas), not the crop.
Apparently, indeed following to your answer:
... but I believe that we are aiming for the label being centered for CV.
the label lies exactly in the middle of the crop based on code:
top_left_x = x - crop_width / 2
top_left_y = y - crop_height / 2
cropped_square = im.crop((top_left_x, top_left_y, top_left_x + crop_width, top_left_y + crop_height))
Correct! Is this data that you think you need?
I don't know yet. Maybe. I'm still recreating the dataset and doing research for other Deep Learning approaches from the existing ones.
I'll cover photographerHeading
first.
The photographer_heading is the heading (with respect to true north) of the camera when the image was taken. So it is the heading of the center of the entire panorama image.
Ok.
However, assuming true north is on the middle of the pano, and assuming question 2 in Issue True North that GSV camera focuses on true north, thus photographerHeading
should always be $0$ since true north and the center of GSV camera are alligned, right?
How photographerHeading
have values other than $0$?
Despite the first question, considering this value's not fixed, is photographerHeading
value between $-180º$ and $+180º$ because of true north?
The latter. It's the heading (in degrees) of the center of the canvas with respect to true north.
Good so far.
However, true north needs to be correctly defined so heading
computation makes sense.
Does the same happen as photographerHeading
to heading
? i.e. Considering heading
value starts from the left most side of the pano, is the value between $-180º$ and $+180º$ because of true north?
I've got all these questions because I want to understand how heading
, photographerHeading
, canvas_x
, canvas_y
are retrieved in WebPage so then sv_image_
's are computed.
I also want to apologize for this huge reply and this mess of replies.
Note: I'm trying to read and understand Project SideWalk Webpage but it's a mess for me.
Once I've got all answers I can post a huge comment with all explanations, so anyone reading in TLDR can check every concept.
While I was waiting for your reply I thought I could use data from any other city that's already correct (don't know which one has greatest dataset, followed by Seattle) so I can still work in my project without data consistency problems. So just couple simple questions:
Hi, a quick update! Last week I finished updating all of the APIs on the Project Sidewalk side of things. Then I updated DownloadRunner.py
and CropRunner.py
here to use the updated APIs correctly (PR #24). I then also tested out the cropper to make sure that it's working correctly, and removed a bunch of code that isn't in use to make the repository more easily understandable (#26).
What city in your dataset is complete and has no image size or positional issues?
I believe that image size issues have been resolved, so any city should work!
Can I trust and use croppRunner (even though I don't fully understand know how it works atm)?
I just tested it last Thursday, so I believe so! That being said, I only tested it on a handful of images. So if you test on a larger dataset and find errors, do send them my way so I can investigate!
I believe that image size issues have been resolved, so any city should work!
Any city except for DC! :joy: Waiting for some help to get that server up-to-date :) Newberg might be a nice dataset to test on? It is a smaller city so it has less data to download, making it a nice city to test in IMO. Though you should expect a decent amount of the panoramas to no longer be available!
Hi, a quick update! Last week I finished updating all of the APIs on the Project Sidewalk side of things. Then I updated
DownloadRunner.py
andCropRunner.py
here to use the updated APIs correctly (PR #24) ...
Good to hear that. I've checked that out, however, I still need to understand all the concepts, so I know how the old and new math/code works.
... I then also tested out the cropper to make sure that it's working correctly, and removed a bunch of code that isn't in use to make the repository more easily understandable (#26).
Got that. I'll test cropRunner
with the new code.
... I believe that image size issues have been resolved, so any city should work! ...
Sure, as far as I know image_width
& image_height
(new pano_width
& pano_height
) are now being recorded correctly. I assume metadata width & height in server are also updated.
... I just tested it last Thursday, so I believe so! That being said, I only tested it on a handful of images. So if you test on a larger dataset and find errors, do send them my way so I can investigate! ...
Thanks for your test. I'll do it.
Any city except for DC! :joy: ...
Crap! Actually I have the whole DC pano dataset already downloaded :cry:
Anyway. Just wanted to know. As far as I've read in other related issues (quite old ones, but I think they still haven't been addressed yet), as a sum up, sv_image_x
& sv_image_y
are still wrong in DC (link to reference). What about the following? How is this issue doing right now?:
Just want to know the current situation of metadata.
... Waiting for some help to get that server up-to-date :) ...
I may help coding but still don't know how all variables work and what is the real current situation of all metadata issues. While waiting I'll be testing crops with basic knowledge. Additionally, I'll need a csv update with the metadata to crop, until this is fixed.
Newberg might be a nice dataset to test on? It is a smaller city so it has less data to download, making it a nice city to test in IMO. ...
Seems fair, I'll start with Newberg instead.
... Though you should expect a decent amount of the panoramas to no longer be available!
True, as @robertyoung2 said, roughly ~50% of the panos are dead. I wanted to cover this issue some time ago, since I read this:
"F:".concat(window.location.href.split("!1s")[1].split("!2e")[0]).replace('%2F','/')
Also tried to reach @robertyoung2 but coudn't get in touch with him. I have neither found a way to recover old pano_ids through researching on the internet
Maybe to keep up with the concepts, as I know you're pretty slammed working on Sidewalk Webpage, I can post single doubt at a time. Once it's cleared then I can move to the following one. Is it fair for you?
Feel free to say whatever you think. It's just another simple way to work on this issue :relaxed:
I assume metadata width & height in server are also updated.
Yes, these should be up-to-date!
I have the whole DC pano dataset already downloaded ... as a sum up, sv_image_x & sv_image_y are still wrong in DC (https://github.com/ProjectSidewalk/sidewalk-panorama-tools/issues/8#issuecomment-821555408 to reference). What about the following? How is this issue doing right now? ... Just want to know the current situation of metadata.
Correct, sv_image_x/y
is still incorrect in DC, and the names of the variables haven't been changed, etc. But I wouldn't bother thinking about the state that it's in now. I expect it to be up-to-date by the end of the week. The code has been fixed, it's a matter of deploying the code to the server where we're having a problem.
Some of them have been labeled "randomly" by people. Can be omitted by just filtering out labels with "downvotes" as you stated.
Note that DC was our pilot deployment years ago, before we added an interface for validations. So we don't actually have validation information in DC. As such, DC is probably the worst city to test on :sweat_smile: That should probably be noted in the documentation as well...
Additionally, I'll need a csv update with the metadata to crop, until https://github.com/ProjectSidewalk/sidewalk-panorama-tools/issues/25#issuecomment-1519699415 is fixed.
You had generously offered in #25 to make this update and submit a PR. Is this something that you would be willing to do now instead of me getting a CSV together? There is a small sample CSV that you can test with that is part of the repo here. And you shouldn't need a CSV with a full dataset for a city to be able to test out the migration over to using an API endpoint. I'm just very busy and would need to rewrite the query to get the CSV... And the point of using the API is so that I don't have to do that :grin:
... Though you should expect a decent amount of the panoramas to no longer be available!
True, as @robertyoung2 said, roughly ~50% of the panos are dead ... I have neither found a way to recover old pano_ids through researching on the internet
Some panoramas have been removed, and I don't think that it's a big issue to just ignore the ones that are gone! There should still be plenty of images to train on. I wouldn't worry much about the missing images!
Now to address some of your previous comments:
Is the bottom left corner of a pano always cordinate 0,0 in px? Therefore the top right corner becomes 16384,8192 or 13312,6656 depending on the city.
Not quite, top left corner is 0,0 (which is standard when talking about images). And the width of the image isn't city-dependent (despite what our code used to do). It's on an image-by-image basis.
I'm aware it's quite an old post but based on @kavidey's https://github.com/ProjectSidewalk/SidewalkWebpage/issues/1708#issuecomment-505623474
I would not bother to read that post, by reading that, you are getting farther from the truth... You described what sv_x/y
, photographer_heading
, and heading
based on that post, and most (if not all) of those are incorrect. I believe that I've defined those elsewhere, so I would use those definitions.
Most notably, I defined photographer_heading
earlier in this thread as follows, so I am going to skip over the rest of your post asking a ton of stuff that only makes sense under a different definition:
The photographer_heading is the heading (with respect to true north) of the camera when the image was taken. So it is the heading of the center of the entire panorama image.
Got that. So true horizon https://github.com/ProjectSidewalk/SidewalkWebpage/issues/1708#issuecomment-505623474 of the pano could possibly be the reason of substracting half the pano's heigth (I guess you meant height), right?
Yes, I meant height! And yes, that's why we were subtracting half the pano's height. But again, this is no longer something that needs to be done after the recent updates :)
Issue https://github.com/ProjectSidewalk/sidewalk-panorama-tools/pull/10: pitch
Neither do I know what's this other variable. Is it:
As a reminder, camera_pitch
(previously called photographer_pitch
) is the pitch of the camera w/r/t horizontal. Then pitch
is the pitch of the Explore view of the user when they labeled, w/r/t the camera's pitch. An example: if camera_pitch
is 5 degrees and pitch
is -2 degrees, then the pitch of the Explore view w/r/t horizontal would be 3 degrees.
Issue https://github.com/ProjectSidewalk/sidewalk-panorama-tools/pull/15: Outdated Code in CropRunner.py
In CropRunner, the function
predict_crop_size(sv_image_y)
is called to get the proper perspective scale depending on the distance, right? ...
I would agree with that. You went in to some detail asking about the rationale for how predict_crop_size()
was written, and why it's different from our lat/lng estimation code. The truthful answer is that I have no idea what exactly the author was trying to do when they wrote this code. I honestly don't really understand how it works, so I'm not sure how I would modify it. I added a comment to that function saying as much.
Ultimately for me, however this is being done seemed to work decently well. I have no idea how it works, and from what I can tell, most people who have used our code don't really seem to care how that part works? I imagine that it would take some real work for me to rewrite this function such that it actually makes sense, and unfortunately I just have too many other important things to be worrying about :pensive:
I would welcome a rewrite of this function in a principled way, with an analysis comparing it's performance to our old method to prove that it is an improvement. But I don't have the cycles to do this myself, unfortunately!
Okay, I believe that I've gone through and answered all of your questions! Some I skipped because they are no longer relevant after the API updates. And some I skipped because there was a ton that was based on going down the rabbit hole of assuming that panos are always centered on true north (which they are not).
Let me know if there is anything that's still confusing! I'm really hoping that this is all less overwhelming after the API updates and after I cleaned out a bunch of old code in this repository. I also want to add definitions for the variables like camera_heading
, etc. in the README. I'll post when I've done that too so that you have that as an easy reference.
Okay I updated the README in this commit. Hopefully that's helpful!!
@PabloMoret the DC API is now up-to-date as well!
Yes, these should be up-to-date!
Nice!
Correct,
sv_image_x/y
is still incorrect in DC, and the names of the variables haven't been changed, etc. But I wouldn't bother thinking about the state that it's in now. I expect it to be up-to-date by the end of the week. The code has been fixed, it's a matter of deploying the code to the server where we're having a problem.
Thanks a lot!
Note that DC was our pilot deployment years ago, before we added an interface for validations. So we don't actually have validation information in DC. As such, DC is probably the worst city to test on sweat_smile That should probably be noted in the documentation as well...
Ok, got it. Didn't know DC was a pilot but makes sense.
You had generously offered in #25 to make this update and submit a PR.
Sure, I'll be happily working on cropRunner
now.
... Is this something that you would be willing to do now instead of me getting a CSV together? There is a small sample CSV that you can test with that is part of the repo here. And you shouldn't need a CSV with a full dataset for a city to be able to test out the migration over to using an API endpoint. I'm just very busy and would need to rewrite the query to get the CSV... And the point of using the API is so that I don't have to do that grin
Actually you're totally right. Did couple of tries with cvMetadata on Postman, and for some reason I haven't found yet it didn't work (had a "proxy error"). However I tried on chrome and it works perfectly fine. My fault :sweat_smile: Apologies for that.
Some panoramas have been removed, and I don't think that it's a big issue to just ignore the ones that are gone! There should still be plenty of images to train on. I wouldn't worry much about the missing images!
Not a problem. As you say I'll probably be able to train with ~50k panos.
Not quite, top left corner is 0,0 (which is standard when talking about images). And the width of the image isn't city-dependent (despite what our code used to do). It's on an image-by-image basis.
Alright. Now I understand most of the math in cropRunner
has changed.
I would not bother to read that post, by reading that, you are getting farther from the truth... You described what
sv_x/y
,photographer_heading
, andheading
based on that post, and most (if not all) of those are incorrect. I believe that I've defined those elsewhere, so I would use those definitions.
Here in this issue you've defined most of them. Maybe I dug deeper needlessly.
Most notably, I defined
photographer_heading
earlier in this thread as follows, so I am going to skip over the rest of your post asking a ton of stuff that only makes sense under a different definition:
Ok. I guess everything boils down to "true north". Almost any variable in metadata is referenced w.r.t true north.
Yes, I meant height! And yes, that's why we were subtracting half the pano's height. But again, this is no longer something that needs to be done after the recent updates :)
So updates now just consider pano_x
& pano_y
as the only values to compute crops. All the logic behind lies in SidewalkWebpage.
Just a comment: that was @robertyoung2's implementation, so maybe he knows how it works.
As a reminder,
camera_pitch
(previously calledphotographer_pitch
) is the pitch of the camera w/r/t horizontal. Thenpitch
is the pitch of the Explore view of the user when they labeled, w/r/t the camera's pitch. An example: ifcamera_pitch
is 5 degrees andpitch
is -2 degrees, then the pitch of the Explore view w/r/t horizontal would be 3 degrees.
Ok, now I understand! :smile:
I would agree with that. You went in to some detail asking about the rationale for how
predict_crop_size()
was written, and why it's different from our lat/lng estimation code. The truthful answer is that I have no idea what exactly the author was trying to do when they wrote this code. I honestly don't really understand how it works, so I'm not sure how I would modify it. I added a comment to that function saying as much.
Ok, exactly as my comment above. Luckily it works so far.
Ultimately for me, however this is being done seemed to work decently well. I have no idea how it works, and from what I can tell, most people who have used our code don't really seem to care how that part works? I imagine that it would take some real work for me to rewrite this function such that it actually makes sense, and unfortunately I just have too many other important things to be worrying about :pensive:
Don't worry. I totally understand how busy you are. So, it's better to leave it as it is as long as it works correctly.
I would welcome a rewrite of this function in a principled way, with an analysis comparing it's performance to our old method to prove that it is an improvement. But I don't have the cycles to do this myself, unfortunately!
No problem :relaxed:
Okay, I believe that I've gone through and answered all of your questions! Some I skipped because they are no longer relevant after the API updates. And some I skipped because there was a ton that was based on going down the rabbit hole of assuming that panos are always centered on true north (which they are not).
Just created that issue to have a deeper understanding of how previous version of cropRunner
used to work and to possibly help in this and other related repos. Since this update, maybe I don't need to dig deeper yet. However it would be cool to understand everything but I admit is time consuming and it's not worth it a.t.m.
Let me know if there is anything that's still confusing! I'm really hoping that this is all less overwhelming after the API updates and after I cleaned out a bunch of old code in this repository. I also want to add definitions for the variables like
camera_heading
, etc. in the README. I'll post when I've done that too so that you have that as an easy reference.
Good! I think it's ok for now. Don't want to bother you.
Thanks for taking some time to discuss this issue! :smile:
Panorama-tools Questions & Issues
First of all I apologize for my english, it's not my first language. Please let me know if anything's wrong or undertandable.
OverView
The purpose of this issue is to clarify some names and concepts used in panorama-tools repo and ProjectSidewalk in general.
I'm currently working on my Bachelor's thesis and found this incredible project. My goal is to get the whole dataset (all label crops) to do a further research on Deep Learning (exploring new nets and models that best fit this project for labeling and validating). Similar to Deep Learning for Sidewalk Assessment. However, there are some unclear things and concepts to me that I'd like to cover in order to better understand the whole code and to be sure it works as expected.
The main issue is that
CropRunner.py
hasn't been thoroughly tested. I'm reading the whole code and all related issues regarding panorama-tools to check whether the crops are done as expected and also to have a big overview. Plenty of questions have arised from it. These issues and clarifications may also be useful to anyone reading this code.Issue #1: Version of CropRunner.py
According to the project instructions, CropRunner is somehow outdated (actually the last commit was one year ago) and you are currently working on a new version. I'd like to know how's this update going. I may have the time to help in developing but I first need deeper understanding of the concepts described below. Depending on your decision to update CropRunner and the time available to complete my thesis, I might either tweak the current version for myself and use it almost as it is or update the code for future use.
Issue #2: Metadata and Depth Information
As far as I understand, both metadata and depth data were/are used to:
And also:
Is that true? Or was metadata specifically used for computing lat/lng and separately depth info specifically to crop panos?
I know nowadays the only source available is a database containing just metadata information in csv format at cs.washington.edu which has a structure similar to or the same as labeldata.csv due to google decision to take the endpoint down.
Issue #3: Ground Truth
In the issue #2374 you talk about the ground truth. By googling it, says it's data or information to be real or true. However, I just wanted to make sure has nothing to do with GSV ground lat/lng position. I guess when you talk about ground truth you just mean lat/lng and GSV info gathered directly via google API. No info inferred from it.
Issue #4: canvas_x & canvas_y
As mentioned in this comment
canvas_x
andcanvas_y
represent the coordinates (x & y values) of the label within the canvas (720x480 exploring box of a pano: cropped pic where the tag's placed) in ProjectSidewalk web page.Mention the use of them here.
Note: See picture in Issue #5.
Issue #5: sv_image_x & sv_image_y
As stated in this comment
sv_image_x
&sv_image_y
are the coordinates (x & y values) of the label within the pano.Therefore,
canvas_x
&canvas_y
andsv_image_x
&sv_image_y
are essentially the same (equivalent) but in different systems: x & y position values in canvas and pano respectively.Note: I assumed (based on CropRunner and how labels are placed using ProjectSidewalk web labeling tool) any label is not strictly in the middle of the crop. Please let me know if otherwise.
Issue #6: distance
Mentioned here to be used when cropping at different scale.
Computed as mentioned in this gist.
As far as I undestand (again refering to the same comment as above), the
distance
is the segment between the GSV car that took the pano (actually the lat/lgn position of the pano) and the label itself. Is it?That distance was computed using depthmap, but not anymore.
Issue #7: heading
As fas as I get its the value in the x axis but in sexagesimal system, i.e. in degrees e.g. 192.1281378163º. However, I have plenty of doubts here:
Is it the x position of the label?![Canvas label](https://user-images.githubusercontent.com/43816464/225960764-c650e474-3f5b-4d87-955d-af1d4a2b8209.svg)
Is it the x position of the center of the canvas (720x480) instead?![Canvas Center](https://user-images.githubusercontent.com/43816464/225960909-f119baf0-e560-45c8-bec1-3e710b51687f.svg)
What's the difference between the heading in samples/labeldata.csv, cv-metadata-seattle.csv or any other csv file containing metadata AND the heading computed with the regression method (
difference in heading
).I might see
heading
in the regression method is infered, not ground truth, but still don't understand why it's needed and why you need to compute it withcanvas_x
instead of just using the originalheading
got from Sidewalk.Issue #8: photograhper heading
Honestly I have no idea what's this variable, what has to do with simple
heading
and the rest of the variables.Analyzing code in
CropRunner.py
there's:Why would photograhper (GSV camera) and viewers (people accessing google street view in maps) have different heading? When someone's using GSV, he's/she's essentially viewing the original pano, right?
Why using two different
heading
s: standard and photographer?Also used here:
Issue #9: PanoYawDegree
It's mentioned in Rob's comment and used in
CropRunner.py
:Regardless the previous questions about
heading
, what's the reason behind substractingphotographer_heading
to $180$ and then using it to calculate label x position? Is it becausephotographer_heading
is displaced? Why not just usingphotographer_heading
orheading
directly?There's a deeper question/discussion in issue #15
Issue #10: pitch
Neither do I know what's this other variable. Is it:
Issue #11: im_with & im_heigh
I guess they're just the x,y values of the pano size. Mentioned them here: comment1, comment2. Used them in
CropRunner.py
.Issue #12: lat/lng estimation method
I've got some questions regarding the regression method, but maybe can be covered later, once the rest of the issues have been resolved.
link to the regression method.
Issue #13: image_width & image_heigh incorrect values
Regarding the image and measure values. According to #2485 the
sv_image_width
andsv_image_heigh
in database are not correct. Now:So two issues are been disscussed here:
sv_image_x
andsv_image_y
are wrong.image_width
andimage_heigh
are wrong as well, aparently in Seattle.#2485 is still open since Mar 15 2022 but, as far as I get,
im_width
andim_heigh
are "already" solved using SidewalkWebPage API mentioned here in this pull request.sv_image_x
andsv_image_y
. I think those are fixed inCropRunner.py
using the ~$1.23$ scaling factor since panos have been "enlarged" from 13k to 16k. Hence the previous oldsv_image_x
&sv_image_y
in 13k SV pano reference now have to be "enlarged" too.Now, why this issue's still open?
photographer_heading
andphotographer_pitch
are moved by ~1 to 4 degrees?Regarding this comment some coordinates might be negative. Why is it? Due to the 180 degrees (mentioned here in issue #15)?
Here just justifying point 2. Regarding some comments like this one in panorama-tools:#8 and #2814 I guess variables
sv_image_x
andsv_image_y
needed to be multiplied by scaling factor ~1.23 to be consistent with the big pano size in seattle. My point here is that imho Rob was right because probably label positionssv_image_x
andsv_image_y
were outdated due to the old pano size. Once new big 16k+ 8k+ panos were released by Google, then multiplying by scaling factor was needed to get the consistent values. And that's why imho justifies @robertyoung2's comment.Therefore, are all labels with this problem already fixed for cs.washington.edu API?
Issue #14: Labels differ in position when far away
Could the scale factor of ~$1.23$ be a legitimate fix for #2478?
Or could this have relation with this comment, the problem with being off in y axis?
I don't understand cause a different issue regarding several labels pointing to the same problem is withing the metatada #8 issue but no further research has been done: no comments talk about this problem.
Issue #15: Outdated Code in CropRunner.py
This issue covers some of the missing formulas and concepts in
CropRunner.py
. This is from my personal perspective.In
CropRunner
, the functionpredict_crop_size(sv_image_y)
is called to get the proper perspective scale depending on the distance, right? As far as I'm concern, a closer tag might need a bigger crop to get the surroundings so DeepLearning apps are "aware" of the context, and the further ones may need a smaller crops since objects and surroundings are closer near the vanishing point.However, some concerns here:
distance
is computed just once inpredict_crop_size
, whereas label-latlng-estimation specifies 3 different formulas of distance depending on zoom level. Should the code compute a specific crop size depending on the pano's zoom level?distance
computed inpredict_crop_size
just usingsv_image_y
, as opposed to the one in gist, which uses a different formula withsv_image_y
andcanvas_y
. Is there any procedure or formula transformation from gist to code? Apparently thecanvas_y
factor is so small so I guess you've omitted it.However,
Non of the formulas on label-latlng-estimation matches the one in
CropRunner.py
:vs
heading
formula in gist usescanvas_x
. However, the only value used in code isphotographer_heading
(already covered in #8):Then what is
difference in heading
used for?How
x
andy
values are computed?I mean, I understand PanoYawDeg is sexagesimal value so needs to be transformed by dividing it to 360. But
sv_image_x
andsv_image_y
already define where the label lies in the pano. What's the sense in computingx
andy
as above? As long as the scaling factor works fine, there would be no need in the formula above.Issue #16: metadata csv and api call for pano size and sv_image values for distance
This question is specifically related to labels. Apparently there are two different files containing data as to label variables:
cv-metadata-city.csv
and csv retrieved from v2 api onsidewalk-sea.cs.washington.edu
. But each one has some caveats:sv_image_x
&sv_image_y
used to compute distance inCropRunner.py
. However, data onsidewalk-sea.cs.washington.edu
contains a frequently updated database of labels but nosv_image_x
,sv_image_y
offered. Is there any transformation fromcanvas_x
,canvas_y
tosv_image
variables so updated database data can be used?sv_image_x/y
variables are needed but there's no transformation at all, can I have an updated csv file to useCropRunner.py
once I've got all pano dataset downloaded?crop_box_helper
andmake_single_crop
? Both crop panos but the first one seems to focus on label type 2.Other Issues
Just to be sure and so it's useful to other people using these Project repos and API:
https://sidewalk-city.cs.washington.edu/adminapi/labels/panoid
&https://sidewalk-city.cs.washington.edu/adminapi/panos
is exactly the same. reference