Closed taz4270 closed 5 years ago
There is something inside of Browser.CreatePageInContextAsync
or a property missing, that doesn't allow the program to proceed
Because I've been searching your code inside-out and even tried to find a result, I tried to make a method to find how deep i could go to find more precise as I could:
private static async Task<Page> SepareteContext(Browser browser){
var browserContextArray = browser.BrowserContexts();
var browserContext = browserContextArray.LastOrDefault();
if (browserContext is null)
return null;
var page = await browserContext.NewPageAsync(); //Even tried with .ConfigureAwait(false)
return page;
}
The bug occurrs when page is awaiting for browserContext.NewPageAsync()
, it never has the anwser back
PS: this is inside NewPage
Now
What is NewPage
doing?
NewPage
is like NewPageAsync
but synchronous
@taz4270 how are you making that synchronous?
Sort of, it's a "weird specie of a bird", That is synchronous but it waits for Tasks to be executed
I bet you're are locking the thread preventing Puppeteer from getting messages from chromium, you should await all the way.
I'm not sure about that, the Task is the responsable to get the message, besides, i already did it once, and even got the anwser, but now, i migrated to my main testbench and it blew up, and I can't even access the other IIS anymore, it doesnt allow me xD
I can't do full async
because most of my program doesn't support it
That's why I'm asking how are you waiting for the task. If you do browser.NewPage().Result
it'll just not going to work.
of course not. but Task.Run(()=>browser.NewPageAsync()).Result
do, but not on ISS, trust me, i did it on windows forms but i cant do it on IIS xD
You should never use .Result
There are many documents talking about this https://msdn.microsoft.com/en-us/magazine/jj991977.aspx
You can wrap it inside a Task.Run but you should wrap your entire functionality not just pieces. Puppeteer is async by definition there is no way to change that.
I will try, but how can you explain me is it working on windows forms, and all other methods work fine as a line?
@taz4270 ASP.NET and Winforms don't share the same synchronization context. This is a good reading https://msdn.microsoft.com/en-us/magazine/gg598924.aspx
Ok, thanks, now i have to make everything from the ground i think 👍
Nop i was right, its was not my fault, i did everything, in IIS, async
, but still locks there. At the same point as before Page page = await browser.NewPageAsync();
@taz4270 can you share some testable code?
I will try tomorrow, but today i cant
So, as you know there is an issue in browser.NewPageAsync
and here is the context
public class HeadlessBrowser
{
private static readonly string HostPath = HostingEnvironment.MapPath("~/App_Data/");
private string ExecutionPath { get; set; }
private Page CurrentPage { get; set; }
public string Link { get; set; }
public HeadlessBrowser(string link)
{
Link = link;
}
public async Task Setup()
{
ExecutionPath = await GetExecutionPathAsync();
CurrentPage = await GetPageAsync();
}
private static BrowserFetcher GetBrowserFetcher() => new BrowserFetcher(new BrowserFetcherOptions()
{
Path = HostPath
});
private static async Task<string> GetExecutionPathAsync()
{
BrowserFetcher browserFetcher = GetBrowserFetcher();
string ExecutionPath = browserFetcher.GetExecutablePath(BrowserFetcher.DefaultRevision);
if (!File.Exists(ExecutionPath))
await browserFetcher.DownloadAsync(BrowserFetcher.DefaultRevision);
return ExecutionPath;
}
private async Task<Browser> GetBrowserAsync => await Puppeteer.LaunchAsync(new LaunchOptions()
{
Headless = true,
ExecutablePath = ExecutionPath
});
private async Task<Page> GetPageAsync()
{
return await GetNavigatedPageAsync(await GetBrowserAsync());
}
private async Task<Page> GetNavigatedPageAsync(Browser browser)
{
try
{
Page page = await browser.NewPageAsync(); //Turns into a vegetable
await page.GoToAsync(Link,30000);
}
catch (Exception e)
{
//It never comes to this part
Debug.WriteLine(e);
throw e.GetBaseException();
}
}
public async Task<Stream> GetScreenshotStreamAsync() => await CurrentPage.ScreenshotStreamAsync();
//Thats all you need to know about this class
}
Now the WebApi
[RoutePrefix("api")]
public class ThumbnailController : BaseApiController (Descendent of ApiController)
{
[HttpGet,Route("thumb/v1")]
public async Task<IHttpActionResult> ThumbnailAsync(string address)
{
HeadlessBrowser browser = new HeadlessBrowser(address);
await browser.Setup();
Stream image = await browser.GetScreenshotStreamAsync();
return Ok(image != null? "ok" : "not ok");
}
}
https://github.com/kblok/puppeteer-sharp/blob/master/lib/PuppeteerSharp/BrowserContext.cs
Line 74 - public Task<Page> NewPageAsync() => Browser.CreatePageInContextAsync(_id);
Was this supposed to not have an await
and async
?
@taz4270 we use Task cascading a lot. Instead of awaiting the task we just return it, until someone really needs to await it. This is a good read https://weblogs.asp.net/sfeldman/task-vs-async-task
That's the awnser for last question. What about the code above?
Even that, in this situation, it needs to return the page or it will wait forever
@taz4270 I just created a project with your code and it worked.
@kblok . We have a dotnet core 2.1 application running on IIS 7.5 using puppeteer-sharp 1.9. The server is windows 2008 R2.
When is called await browser.NewPageAsync(), we received a timeout even using configureawait false.
Any suggestion for us?
Thanks.
@willsbctm I'll take a look running it on IIS.
@willsbctm @taz4270 reproduced guys.
Wait reproduced the bug or did you successfully executed? I got confused
@taz4270 I'm getting the same issue on IIS
@kblok any update about the bug? Whats the issue or what do you think it is?
@taz4270 we are getting a deadlock when we use the session for the very first time https://github.com/kblok/puppeteer-sharp/blob/master/lib/PuppeteerSharp/Page.cs#L1520
The thread is being stuck there, and it doesn't let puppeteer keep listening to the WebSocket to get the answer to that command. No joy yet.
Guys, I found a similar issue for puppeteer. https://github.com/GoogleChrome/puppeteer/issues/1262
On this case the workaround was adding --no-sandbox on chromium, is this a "solution"?
@lucascebertin I don't think so. I read a little bit about this issue and the problem is that ASP.NET thread management. I can't confirm it yet, but it seems that ASP.NET is assigning only one thread per request and that deadlocks the execution because it doesn't allow the connection (which is using Task.Run
to listen to new messages.
I bet I'll be working on this after I release 1.10.
@taz4270 @willsbctm could you try v1.10?
I tried the v1.10, with --no-sandbox works, without it looks like the deadlock is still there.
Don't know if this log can help, but that's what I got:
`trce: PuppeteerSharp.Connection[0] Send 1 Method Target.setDiscoverTargets Params { discover = True }
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"3276034a-b7f0-462b-af30-c26a3fe05f40","type":"browser","title":"","url":"","attached":false}}}
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"5EACC14CD88992030B76F44B4729FDDD","type":"page","title":"","url":"about:blank","attached":false,"browserContextId":"6E425A78148B53E4686790E8F708EADA"}}}
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"b97ca89b-6136-4bad-8de2-8aa0722e3820","type":"browser","title":"","url":"","attached":true}}}
trce: PuppeteerSharp.Connection[0] ? Receive {"id":1,"result":{}}
trce: PuppeteerSharp.Connection[0] Send 2 Method Target.createTarget Params [url, about:blank]
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"F5E5D9799EEDBC8CE189DE8BE015AD9A","type":"page","title":"","url":"","attached":false,"browserContextId":"6E425A78148B53E4686790E8F708EADA"}}}
trce: PuppeteerSharp.Connection[0] ? Receive {"id":2,"result":{"targetId":"F5E5D9799EEDBC8CE189DE8BE015AD9A"}}
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"F5E5D9799EEDBC8CE189DE8BE015AD9A","type":"page","title":"","url":"about:blank","attached":false,"browserContextId":"6E425A78148B53E4686790E8F708EADA"}}}
trce: PuppeteerSharp.Connection[0] Send 3 Method Target.attachToTarget Params { targetId = F5E5D9799EEDBC8CE189DE8BE015AD9A }
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetCrashed","params":{"targetId":"F5E5D9799EEDBC8CE189DE8BE015AD9A","status":"failed to launch","errorCode":18}}
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"F5E5D9799EEDBC8CE189DE8BE015AD9A","type":"page","title":"","url":"about:blank","attached":true,"browserContextId":"6E425A78148B53E4686790E8F708EADA"}}}
trce: PuppeteerSharp.Connection[0] ? Receive {"method":"Target.attachedToTarget","params":{"sessionId":"87634875ECAD81B62CDC1B6CDACEA827","targetInfo":{"targetId":"F5E5D9799EEDBC8CE189DE8BE015AD9A","type":"page","title":"","url":"about:blank","attached":true,"browserContextId":"6E425A78148B53E4686790E8F708EADA"},"waitingForDebugger":false}}
trce: PuppeteerSharp.Connection[0] ? Receive {"id":3,"result":{"sessionId":"87634875ECAD81B62CDC1B6CDACEA827"}}
trce: PuppeteerSharp.CDPSession[0] Send 1 Method Page.enable Params (null)
trce: PuppeteerSharp.Connection[0] Send 4 Method Target.sendMessageToTarget Params [sessionId, 87634875ECAD81B62CDC1B6CDACEA827], [message, {"id":1,"method":"Page.enable","params":null}]
trce: PuppeteerSharp.Connection[0] ? Receive {"id":4,"result":{}}`
@lucascebertin any reason not to use --no-sandbox
? For what I read in Puppeteer is something that users end up setting it when they run with restricted permissions.
No, I think it's ok in our scenario set --no-sandbox
, we are just creating some PDFs and inside an very controlled environment.
I just thought a little weird the deadlock without the arg being passed but just because I got curious to really understand the problem haha
Would be a good thing consider to add this magic
argument on README?
Agreed @lucascebertin. I'm also working on a better connection transport for AspNet Full Framework. As it's not recommended to use Task.Run
there.
That's great. By the way, thanks for your help on it :blush:
Unfortunatly, i'm currently too busy to do the experience, maybe i have some time over the weekend, but until then, i dont think so
We just release a new package for ASP.NET Framework https://www.nuget.org/packages/PuppeteerSharp.AspNetFramework/
Here's one example https://github.com/kblok/puppeteer-sharp/blob/master/samples/PupppeterSharpAspNetFrameworkSample/PupppeterSharpAspNetFrameworkSample/Controllers/TestController.cs
Give it a try. If you have any issues let's open another issue.
@willsbctm did you figure out how to run it on windows server 2008 R2/Net Core ? @kblok whats the fix thats added in PuppeteerSharp.AspNetFramework? this doesn't work with net core, need a fix for netcore application.
@sudheerkaku some reading here https://github.com/kblok/puppeteer-sharp/pull/776
Basically, we shouldn't use Task.Run on ASP.NET. But I can't say that this is THE solution.
@kblok I'm not using Task.run public static void Main(string[] args) { SavePDFUsingPuppeteerSharp(filePath, pdfFilePath).Wait(); } public static async Task SavePDFUsingPuppeteerSharp(string htmlFilePath, string pdfFilePath) { await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision); var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }); var page = await browser.NewPageAsync(); PdfOptions pdfOptions = new PdfOptions(); pdfOptions.PreferCSSPageSize = true; await page.GoToAsync(htmlFilePath, WaitUntilNavigation.Load); await page.PdfAsync(pdfFilePath, pdfOptions); }
@sudheerkaku but the Puppeteer-Sharp connection does. PuppeteerSharp.AspNetFramework uses HostingEnvironment.QueueBackgroundWorkItem
ah got it, PuppeteerSharp.AspNetFramework is for .netframework unfortunately i'm using .netcore.. so should I use the code and make the changes?
Sorry @sudheerkaku I missed that part. I'm locking the conversation on this issue. If you have any issues feel free to create a new one with the steps to reproduce it.
browser.NewPageAsync() dies in IIS same problem as #375
Hello again, today I found a issue that was asked before on #375, the
browser.NewPageAsync
is not returning anything, don't know if there is an argument missing (probably not), or there was something that wasn't right at the master code, so i tried to find a solution, i've kinda made it work but unfortunatly, i can't find how can i open the other ISS' port (something like that but this part doesn't matter, just know that it worked), and when i tried to migrate the code the other side, it blow upThere is some part of my code
Remember, Im using my addon, so any method missing
Async
, is my doing